= UTK multicores + GPU =

== List of machines ==
http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180

== Display a list of available GPUs ==
 $ nvidia-smi -L
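- The exact listing depends on the node; as a rough, made-up illustration (device names and UUIDs below are not taken from the UTK machines), the output has one line per device, and these indices are normally what the g= suboption further down refers to (assuming the default CUDA device ordering matches nvidia-smi's):
 GPU 0: Tesla M2090 (UUID: GPU-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx)
 GPU 1: Tesla M2090 (UUID: GPU-yyyyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy)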

== Using Fupermod on a hybrid multicore/GPU node ==
* Compiling: create two separate build directories, one configured with the selected CPU cblas (e.g. gsl, acml, mkl) and one with the GPU cblas (e.g. cublas).

- For example, configuring with [http://developer.amd.com/libraries/acml/pages/default.aspx acml] for the CPU and [http://developer.nvidia.com/cublas cublas] for the GPU:

 $ cd fupermod/

 $ mkdir acml_config
 $ cd acml_config
 $ ../configure --with-blas=acml
 $ make

 $ cd ..
 $ mkdir cuda_config
 $ cd cuda_config
 $ ../configure --with-blas=cuda
 $ make
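- A quick sanity check (a sketch, assuming the mxm library is dynamically linked against the chosen BLAS) is to inspect each build with ldd and look for the expected library name:
 # which BLAS did each build tree pick up? expect acml in one, cublas in the other
 $ ldd $HOME/fupermod/acml_config/routines/mxm/.libs/libmxm_1d.so | grep -i -e acml -e cublas
 $ ldd $HOME/fupermod/cublas_config/routines/mxm/.libs/libmxm_1d.so | grep -i -e acml -e cublas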
* Building the performance model:

- The rankfile controls [http://www.open-mpi.org/doc/v1.6/man1/mpirun.1.php#sect8 process binding], and the appfile tells mpirun which programs to launch:
 $ mpirun -rf rankfile -app appfile_fpm

- Example of a rankfile:
 rank 0=ig.icl.utk.edu slot=0:0
 rank 1=ig.icl.utk.edu slot=0:1
 ...
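- Writing all 48 rank lines by hand is tedious. The following sketch generates such a rankfile; the hostname and the 4 sockets x 12 cores layout (48 slots, matching the 1 GPU + 47 CPU processes used below) are assumptions, so check the node's real topology first (e.g. with hwloc's lstopo):
 # sketch: generate ranks 0..47 bound to socket:core pairs
 # assumes 4 sockets x 12 cores; adjust HOST, SOCKETS and CORES to the node
 HOST=ig.icl.utk.edu
 SOCKETS=4
 CORES=12
 for rank in $(seq 0 $((SOCKETS * CORES - 1))); do
   echo "rank $rank=$HOST slot=$((rank / CORES)):$((rank % CORES))"
 done > rankfile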

- Example of an appfile for building the functional performance model (appfile_fpm):
 # GPU
 # e.g. linking against cublas; fupermod is configured under cublas_config
 # the suboption g=0 means device 0 is selected for computing
 -host localhost -np 1 $HOME/fupermod/cublas_config/tools/builder -l $HOME/fupermod/cublas_config/routines/mxm/.libs/libmxm_1d.so -o k=640,g=0 -U10000 -s10
 # CPU
 # e.g. linking against acml; fupermod is configured under acml_config
 -host localhost -np 47 $HOME/fupermod/acml_config/tools/builder -l $HOME/fupermod/acml_config/routines/mxm/.libs/libmxm_1d.so -o k=640 -U10000 -s10

* Data partitioning:

- Matrix size D = N x N (e.g. D = 10000 for N = 100), and machinefile lists the nodes participating in the computation:

 $ fupermod/tools/partitioner -l fupermod/routines/mxm/.libs/libmxm_1d.so -D10000 -o N=100 -m machinefile

* Running the matrix multiplication:

 $ mpirun -rf rankfile -app appfile_mxm

- Example of an appfile for matrix multiplication (appfile_mxm):
 # GPU
 # assuming fupermod is configured under cublas_config, linking against cublas
 # -g0 means device 0 is selected for computing
 -host localhost -np 1 $HOME/fupermod/cublas_config/routines/mxm/mxm_2d -k640 -g0 -m machinefile
 # CPU
 # assuming fupermod is configured under acml_config, linking against acml
 -host localhost -np 47 $HOME/fupermod/acml_config/routines/mxm/mxm_2d -k640 -m machinefile
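- Putting the three steps together, a minimal driver script could look as follows. It only chains the commands shown above; the rankfile, appfiles and machinefile are assumed to already exist in the working directory, and running the partitioner from the acml_config build tree is an assumption (the page above just uses fupermod/tools/partitioner):
 #!/bin/sh
 # sketch: build the performance models, partition the data, run the multiplication
 set -e
 mpirun -rf rankfile -app appfile_fpm
 $HOME/fupermod/acml_config/tools/partitioner \
     -l $HOME/fupermod/acml_config/routines/mxm/.libs/libmxm_1d.so \
     -D10000 -o N=100 -m machinefile
 mpirun -rf rankfile -app appfile_mxm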
= OpenMPI =

http://www.open-mpi.org/faq/

== MCA parameter files ==
If you want to permanently use some MCA parameter settings, you can create a file $HOME/.openmpi/mca-params.conf, e.g.:

 cat $HOME/.openmpi/mca-params.conf
 btl_tcp_if_exclude = lo,eth1
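The same parameter can also be set for a single run directly on the mpirun command line (-np 4 and ./my_app are placeholders):
 # one-off equivalent of the mca-params.conf entry above
 $ mpirun --mca btl_tcp_if_exclude lo,eth1 -np 4 ./my_app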

== Running applications on Multiprocessors/Multicores ==
Processes can be bound to specific sockets and cores on a node by choosing the right mpirun options (a short command-line example follows the links below).
* [http://www.open-mpi.org/doc/v1.4/man1/mpirun.1.php#sect9 Process binding]
* [http://www.open-mpi.org/doc/v1.4/man1/mpirun.1.php#sect10 Rankfile]
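For example, in the 1.4/1.6 series the binding can be requested on the command line or through an explicit rankfile (the process count and application name are placeholders):
 # bind one process per core, ranks laid out by core
 $ mpirun -np 8 --bycore --bind-to-core ./my_app
 # or use a rankfile, as in the UTK multicores + GPU page
 $ mpirun -rf rankfile -np 8 ./my_app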

== PERUSE ==
[[Media:current_peruse_spec.pdf|PERUSE Specification]]

= CUDA SDK =

http://developer.nvidia.com/gpu-computing-sdk

= BLAS LAPACK ScaLAPACK =

A de facto standard API for linear algebra: [http://en.wikipedia.org/wiki/Basic_Linear_Algebra_Subprograms BLAS]/[http://en.wikipedia.org/wiki/LAPACK LAPACK]
* Original http://www.netlib.org/blas/ http://www.netlib.org/lapack/ - implemented in Fortran. The libraries can be used from C/C++ (the so-called Fortran interface to BLAS/LAPACK).
* ATLAS http://math-atlas.sourceforge.net/ - provides a C interface to BLAS and partially to LAPACK. Binary packages: libatlas-[base or platform name, for example sse2]
* MKL http://software.intel.com/en-us/intel-mkl/ - Intel's implementation
* ACML http://developer.amd.com/libraries/acml/pages/default.aspx
* CUBLAS http://developer.nvidia.com/cublas

Using the C interface is preferable. [http://www.inf.bv.tum.de/~heisserer/softwarelab04/doc/blas_report.pdf BLAS: overview, installation, usage]
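As a hedged illustration of the C interface, a program calling cblas_* routines can typically be linked against ATLAS as shown below (the source file name is a placeholder; ACML, MKL and CUBLAS ship their own headers and link lines):
 # illustrative only: build a C program that calls cblas_dgemm against ATLAS
 $ gcc -o my_blas_test my_blas_test.c -lcblas -latlas -lm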

= ScaLAPACK =
http://www.netlib.org/scalapack/
<hr />
<div>A de facto standard API for linear algebra [http://en.wikipedia.org/wiki/Basic_Linear_Algebra_Subprograms BLAS]/[http://en.wikipedia.org/wiki/LAPACK LAPACK]<br />
* Original http://www.netlib.org/blas/ http://www.netlib.org/lapack/ - implemented in Fortran. The libraries can be used in C/C++ (so called Fortran interface to BLAS/LAPACK).<br />
* ATLAS http://math-atlas.sourceforge.net/ - provides a C interface to BLAS and partially LAPACK. Binary packages: libatlas-[base or platform name, for example sse2]<br />
* MKL http://software.intel.com/en-us/intel-mkl/ - Intel implementation<br />
*ACML http://developer.amd.com/libraries/acml/pages/default.aspx<br />
<br />
Using the C interface is preferable. [http://www.inf.bv.tum.de/~heisserer/softwarelab04/doc/blas_report.pdf BLAS: overview, installation, usage]<br />
<br />
= ScaLAPACK =<br />
http://www.netlib.org/scalapack/</div>Zhongziminghttps://hcl.ucd.ie/wiki/index.php?title=UTK_multicores_%2B_GPU&diff=745UTK multicores + GPU2012-07-12T11:02:15Z<p>Zhongziming: </p>
<hr />
<div>== List of machines ==<br />
http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180<br />
<br />
== Display a list of available GPUs ==<br />
$ nvidia-smi -L<br />
<br />
== Using Fupermod on hybrid multicore/GPUs node ==<br />
* Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl)and GPU cblas (e.g. cublas). <br />
<br />
- For example: Using acml blas for CPU and cublas for GPU computing<br />
<br />
cd fupermod/<br />
<br />
mkdir acml_config <br />
cd acml_config<br />
./configure --with-cblas=acml<br />
<br />
mkdir cuda_config <br />
cd cuda_config<br />
./configure --with-cblas=cuda<br />
<br />
* Building performance model:<br />
<br />
- Rankfile is for processing binding, and appfile tells mpi what programs to launch<br />
$ mpirun -rf rankfile -app appfile_fpm<br />
<br />
- An example of a rankfile:<br />
rank 0=ig.icl.utk.edu slot=0:0<br />
rank 1=ig.icl.utk.edu slot=0:1<br />
...<br />
<br />
- An example of an appfile for building functional permanence model (appfile_fpm):<br />
# GPU<br />
# e.g. Linking against cublas, and fupermod is configured under cublas_config<br />
# suboption g=0 means device 0 is selected for computing<br />
-host localhost -np 1 $HOME/fupermod/cublas_config/tools/builder -l $HOME/fupermod/cublas_config/routines/mxm/.libs/libmxm_col.so -o k=640,g=0 -U10000 -s10<br />
# CPU<br />
# e.g. Linking against acml, and fupermod is configured under acml_config<br />
-host localhost -np 47 $HOME/fupermod/acml_config/tools/builder -l $HOME/fupermod/acml_config/routines/mxm/.libs/libmxm_col.so -o k=640 -U10000 -s10<br />
<br />
* Data partitioning<br />
<br />
- Matrix size D = N x N, and machinefile lists the nodes participating in the computing<br />
<br />
$ fupermod/tools/partitioner -l fupermod/routines/mxm/.libs/libmxm_col.so -D10000 -o N=100 -m machinefile<br />
<br />
* Running matrix multiplication<br />
<br />
$ mpirun -rf rankfile -app appfile_mxm<br />
<br />
- An example of an appfile for matrix multiplication (appfile_mxm)<br />
# GPU<br />
# Assuming fupermod is configured under cublas_config, linking against cublas<br />
# -g0 means device 0 is selected for computing<br />
-host localhost -np 1 $HOME/fupermod/cublas_config/routines/mxm/mxm_col -k640 -g0 -m machinefile<br />
# CPU<br />
# Assuming fupermod is configured under acml_config, linking against acml<br />
-host localhost -np 47 $HOME/fupermod/acml_config/routines/mxm/mxm_col -k640 -m machinefile</div>Zhongziminghttps://hcl.ucd.ie/wiki/index.php?title=UTK_multicores_%2B_GPU&diff=744UTK multicores + GPU2012-07-12T11:01:21Z<p>Zhongziming: </p>
<hr />
<div>== List of machines ==<br />
http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180<br />
<br />
== Display a list of available GPUs ==<br />
$ nvidia-smi -L<br />
<br />
== Using Fupermod on hybrid multicore/GPUs node ==<br />
* Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl)and GPU cblas (e.g. cublas). <br />
<br />
- For example: Using acml blas for CPU and cublas for GPU computing<br />
<br />
cd fupermod/<br />
<br />
mkdir acml_config <br />
cd acml_config<br />
./configure --with-cblas=acml<br />
<br />
mkdir cuda_config <br />
cd cuda_config<br />
./configure --with-cblas=cuda<br />
<br />
* Building performance model:<br />
<br />
- Rankfile is for processing binding, and appfile tells mpi what programs to launch<br />
$ mpirun -rf rankfile -app appfile_fpm<br />
<br />
An example of a rankfile:<br />
rank 0=ig.icl.utk.edu slot=0:0<br />
rank 1=ig.icl.utk.edu slot=0:1<br />
...<br />
<br />
- An example of an appfile for building functional permanence model (appfile_fpm):<br />
# GPU<br />
# e.g. Linking against cublas, and fupermod is configured under cublas_config<br />
# suboption g=0 means device 0 is selected for computing<br />
-host localhost -np 1 $HOME/fupermod/cublas_config/tools/builder -l $HOME/fupermod/cublas_config/routines/mxm/.libs/libmxm_col.so -o k=640,g=0 -U10000 -s10<br />
# CPU<br />
# e.g. Linking against acml, and fupermod is configured under acml_config<br />
-host localhost -np 47 $HOME/fupermod/acml_config/tools/builder -l $HOME/fupermod/acml_config/routines/mxm/.libs/libmxm_col.so -o k=640 -U10000 -s10<br />
<br />
* Data partitioning<br />
<br />
- Matrix size D = N x N, and machinefile lists the nodes participating in the computing<br />
<br />
$ fupermod/tools/partitioner -l fupermod/routines/mxm/.libs/libmxm_col.so -D10000 -o N=100 -m machinefile<br />
<br />
* Running matrix multiplication<br />
<br />
$ mpirun -rf rankfile -app appfile_mxm<br />
<br />
- An example of an appfile for matrix multiplication (appfile_mxm)<br />
# GPU<br />
# Assuming fupermod is configured under cublas_config, linking against cublas<br />
# -g0 means device 0 is selected for computing<br />
-host localhost -np 1 $HOME/fupermod/cublas_config/routines/mxm/mxm_col -k640 -g0 -m machinefile<br />
# CPU<br />
# Assuming fupermod is configured under acml_config, linking against acml<br />
-host localhost -np 47 $HOME/fupermod/acml_config/routines/mxm/mxm_col -k640 -m machinefile</div>Zhongziminghttps://hcl.ucd.ie/wiki/index.php?title=UTK_multicores_%2B_GPU&diff=743UTK multicores + GPU2012-07-12T11:00:46Z<p>Zhongziming: </p>
<hr />
<div>== List of machines ==<br />
http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180<br />
<br />
== Display a list of available GPUs ==<br />
$ nvidia-smi -L<br />
<br />
== Using Fupermod on hybrid multicore/GPUs node ==<br />
* Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl)and GPU cblas (e.g. cublas). <br />
<br />
-For example: Using acml blas for CPU and cublas for GPU computing<br />
<br />
cd fupermod/<br />
<br />
mkdir acml_config <br />
cd acml_config<br />
./configure --with-cblas=acml<br />
<br />
mkdir cuda_config <br />
cd cuda_config<br />
./configure --with-cblas=cuda<br />
<br />
* Building performance model:<br />
<br />
-Rankfile is for processing binding, and appfile tells mpi what programs to launch<br />
$ mpirun -rf rankfile -app appfile_fpm<br />
<br />
An example of a rankfile:<br />
rank 0=ig.icl.utk.edu slot=0:0<br />
rank 1=ig.icl.utk.edu slot=0:1<br />
...<br />
<br />
-An example of an appfile for building functional permanence model (appfile_fpm):<br />
# GPU<br />
# e.g. Linking against cublas, and fupermod is configured under cublas_config<br />
# suboption g=0 means device 0 is selected for computing<br />
-host localhost -np 1 $HOME/fupermod/cublas_config/tools/builder -l $HOME/fupermod/cublas_config/routines/mxm/.libs/libmxm_col.so -o k=640,g=0 -U10000 -s10<br />
# CPU<br />
# e.g. Linking against acml, and fupermod is configured under acml_config<br />
-host localhost -np 47 $HOME/fupermod/acml_config/tools/builder -l $HOME/fupermod/acml_config/routines/mxm/.libs/libmxm_col.so -o k=640 -U10000 -s10<br />
<br />
* Data partitioning<br />
<br />
-Matrix size D = N x N, and machinefile lists the nodes participating in the computing<br />
<br />
$ fupermod/tools/partitioner -l fupermod/routines/mxm/.libs/libmxm_col.so -D10000 -o N=100 -m machinefile<br />
<br />
* Running matrix multiplication<br />
<br />
$ mpirun -rf rankfile -app appfile_mxm<br />
<br />
-An example of an appfile for matrix multiplication (appfile_mxm)<br />
# GPU<br />
# Assuming fupermod is configured under cublas_config, linking against cublas<br />
# -g0 means device 0 is selected for computing<br />
-host localhost -np 1 $HOME/fupermod/cublas_config/routines/mxm/mxm_col -k640 -g0 -m machinefile<br />
# CPU<br />
# Assuming fupermod is configured under acml_config, linking against acml<br />
-host localhost -np 47 $HOME/fupermod/acml_config/routines/mxm/mxm_col -k640 -m machinefile</div>Zhongziminghttps://hcl.ucd.ie/wiki/index.php?title=UTK_multicores_%2B_GPU&diff=742UTK multicores + GPU2012-07-12T10:58:56Z<p>Zhongziming: </p>
<hr />
<div>== List of machines ==<br />
http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180<br />
<br />
== Display a list of available GPUs ==<br />
$ nvidia-smi -L<br />
<br />
== Using Fupermod on hybrid multicore/GPUs node ==<br />
* Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl)and GPU cblas (e.g. cublas). <br />
<br />
For example: Using acml blas for CPU and cublas for GPU computing<br />
<br />
cd fupermod/<br />
<br />
mkdir acml_config <br />
cd acml_config<br />
./configure --with-cblas=acml<br />
<br />
mkdir cuda_config <br />
cd cuda_config<br />
./configure --with-cblas=cuda<br />
<br />
* Building performance model:<br />
<br />
rankfile is for processing binding, and appfile tells mpi what programs to launch<br />
$ mpirun -rf rankfile -app appfile_fpm<br />
<br />
An example of a rankfile:<br />
rank 0=ig.icl.utk.edu slot=0:0<br />
rank 1=ig.icl.utk.edu slot=0:1<br />
...<br />
<br />
An example of an appfile for building functional permanence model (FPM):<br />
# GPU<br />
# e.g. Linking against cublas, and fupermod is configured under cublas_config<br />
# suboption g=0 means device 0 is selected for computing<br />
-host localhost -np 1 $HOME/fupermod/cublas_config/tools/builder -l $HOME/fupermod/cublas_config/routines/mxm/.libs/libmxm_col.so -o k=640,g=0 -U10000 -s10<br />
# CPU<br />
# e.g. Linking against acml, and fupermod is configured under acml_config<br />
-host localhost -np 47 $HOME/fupermod/acml_config/tools/builder -l $HOME/fupermod/acml_config/routines/mxm/.libs/libmxm_col.so -o k=640 -U10000 -s10<br />
<br />
* Data partitioning<br />
<br />
matrix size D = N x N, and machinefile lists the nodes participating in the computing<br />
<br />
$ fupermod/tools/partitioner -l fupermod/routines/mxm/.libs/libmxm_col.so -D10000 -o N=100 -m machinefile<br />
<br />
* Running matrix multiplication<br />
<br />
$ mpirun -rf rankfile -app appfile_mxm<br />
<br />
An example of an appfile for matrix multiplication<br />
# GPU<br />
# Assuming fupermod is configured under cublas_config, linking against cublas<br />
# -g0 means device 0 is selected for computing<br />
-host localhost -np 1 $HOME/fupermod/cublas_config/routines/mxm/mxm_col -k640 -g0 -m machinefile<br />
# CPU<br />
# Assuming fupermod is configured under acml_config, linking against acml<br />
-host localhost -np 47 $HOME/fupermod/acml_config/routines/mxm/mxm_col -k640 -m machinefile</div>Zhongziminghttps://hcl.ucd.ie/wiki/index.php?title=UTK_multicores_%2B_GPU&diff=741UTK multicores + GPU2012-07-12T10:58:18Z<p>Zhongziming: </p>
<hr />
<div>== List of machines ==<br />
http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180<br />
<br />
== Display a list of available GPUs ==<br />
$ nvidia-smi -L<br />
<br />
== Using Fupermod on hybrid multicore/GPUs node ==<br />
* Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl)and GPU cblas (e.g. cublas). <br />
<br />
For example: Using acml blas for CPU and cublas for GPU computing<br />
<br />
cd fupermod/<br />
<br />
mkdir acml_config <br />
cd acml_config<br />
./configure --with-cblas=acml<br />
<br />
mkdir cuda_config <br />
cd cuda_config<br />
./configure --with-cblas=cuda<br />
<br />
* Building performance model:<br />
<br />
rankfile is for processing binding, and appfile tells mpi what programs to launch<br />
$ mpirun -rf rankfile -app appfile_fpm<br />
<br />
An example of a rankfile:<br />
rank 0=ig.icl.utk.edu slot=0:0<br />
rank 1=ig.icl.utk.edu slot=0:1<br />
...<br />
<br />
An example of an appfile for building functional permanence model (FPM):<br />
# GPU<br />
# e.g. Linking against cublas, and fupermod is configured under cublas_config<br />
# suboption g=0 means device 0 is selected for computing<br />
-host localhost -np 1 $HOME/fupermod/cublas_config/tools/builder -l $HOME/fupermod/cublas_config/routines/mxm/.libs/libmxm_col.so -o k=640,g=0 -U10000 -s10<br />
# CPU<br />
# e.g. Linking against acml, and fupermod is configured under acml_config<br />
-host localhost -np 47 $HOME/fupermod/acml_config/tools/builder -l $HOME/fupermod/acml_config/routines/mxm/.libs/libmxm_col.so -o k=640 -U10000 -s10<br />
<br />
* Data partitioning<br />
<br />
matrix size D = N x N, and machinefile lists the nodes participating in the computing<br />
<br />
$ fupermod/tools/partitioner -l fupermod/routines/mxm/.libs/libmxm_col.so -D10000 -o N=100 -m machinefile<br />
<br />
* Running matrix multiplication<br />
<br />
$ mpirun -rf rankfile -app appfile_mxm<br />
<br />
An example of an appfile for matrix multiplication<br />
# GPU<br />
# Assuming fupermod is configured under cublas_config, linking against cublas<br />
# -g0 means device 0 is selected for computing<br />
-host localhost -np 1 $HOME/fupermod/cublas_config/routines/mxm/mxm_col -k640 -g0 -m machinefile<br />
# CPU<br />
# Assuming fupermod is configured under acml_config, linking against acml<br />
-host localhost -np 47 $HOME/fupermod/acml_config/routines/mxm/mxm_col -k640 -m machinefile</div>Zhongziminghttps://hcl.ucd.ie/wiki/index.php?title=UTK_multicores_%2B_GPU&diff=740UTK multicores + GPU2012-07-12T10:56:08Z<p>Zhongziming: </p>
<hr />
<div>== List of machines ==<br />
http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180<br />
<br />
== Display a list of available GPUs ==<br />
$ nvidia-smi -L<br />
<br />
== Using Fupermod on hybrid multicore/GPUs node ==<br />
* Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl)and GPU cblas (e.g. cublas). <br />
<br />
** For example: Using acml blas for CPU and cublas for GPU computing<br />
<br />
cd fupermod/<br />
<br />
mkdir acml_config <br />
cd acml_config<br />
./configure --with-cblas=acml<br />
<br />
mkdir cuda_config <br />
cd cuda_config<br />
./configure --with-cblas=cuda<br />
<br />
* Building performance model:<br />
<br />
** rankfile is for processing binding, and appfile tells mpi what programs to launch<br />
$ mpirun -rf rankfile -app appfile_fpm<br />
<br />
** example of a rankfile:<br />
rank 0=ig.icl.utk.edu slot=0:0<br />
rank 1=ig.icl.utk.edu slot=0:1<br />
...<br />
<br />
** example of an appfile for building functional permanence model (FPM):<br />
# GPU<br />
# e.g. Linking against cublas, and fupermod is configured under cublas_config<br />
# suboption g=0 means device 0 is selected for computing<br />
-host localhost -np 1 $HOME/fupermod/cublas_config/tools/builder -l $HOME/fupermod/cublas_config/routines/mxm/.libs/libmxm_col.so -o k=640,g=0 -U10000 -s10<br />
# CPU<br />
# e.g. Linking against acml, and fupermod is configured under acml_config<br />
-host localhost -np 47 $HOME/fupermod/acml_config/tools/builder -l $HOME/fupermod/acml_config/routines/mxm/.libs/libmxm_col.so -o k=640 -U10000 -s10<br />
<br />
* Data partitioning<br />
<br />
** matrix size D = N x N, and machinefile lists the nodes participating in the computing<br />
<br />
$ fupermod/tools/partitioner -l fupermod/routines/mxm/.libs/libmxm_col.so -D10000 -o N=100 -m machinefile<br />
<br />
* Running matrix multiplication<br />
<br />
$ mpirun -rf rankfile -app appfile_mxm<br />
<br />
** example of an appfile for matrix multiplication<br />
# GPU<br />
# Assuming fupermod is configured under cublas_config, linking against cublas<br />
# -g0 means device 0 is selected for computing<br />
-host localhost -np 1 $HOME/fupermod/cublas_config/routines/mxm/mxm_col -k640 -g0 -m machinefile<br />
# CPU<br />
# Assuming fupermod is configured under acml_config, linking against acml<br />
-host localhost -np 47 $HOME/fupermod/acml_config/routines/mxm/mxm_col -k640 -m machinefile</div>Zhongziminghttps://hcl.ucd.ie/wiki/index.php?title=UTK_multicores_%2B_GPU&diff=739UTK multicores + GPU2012-07-12T10:54:56Z<p>Zhongziming: </p>
<hr />
<div>== List of machines ==<br />
http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180<br />
<br />
== Display a list of available GPUs ==<br />
$ nvidia-smi -L<br />
<br />
== Using Fupermod on hybrid multicore/GPUs node ==<br />
* Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl)and GPU cblas (e.g. cublas). <br />
<br />
** For example: Using acml blas for CPU and cublas for GPU computing<br />
<br />
cd fupermod/<br />
<br />
mkdir acml_config <br />
cd acml_config<br />
./configure --with-cblas=acml<br />
<br />
mkdir cuda_config <br />
cd cuda_config<br />
./configure --with-cblas=cuda<br />
<br />
* Building performance model:<br />
<br />
** rankfile is for processing binding, and appfile tells mpi what programs to launch<br />
$ mpirun -rf rankfile -app appfile_fpm<br />
<br />
** example of a rankfile:<br />
rank 0=ig.icl.utk.edu slot=0:0<br />
rank 1=ig.icl.utk.edu slot=0:1<br />
...<br />
<br />
** example of an appfile for building functional permanence model (FPM):<br />
# GPU #<br />
# e.g. Linking against cublas, and fupermod is configured under cublas_config<br />
# suboption g=0 means device 0 is selected for computing<br />
-host localhost -np 1 $HOME/fupermod/cublas_config/tools/builder -l $HOME/fupermod/cublas_config/routines/mxm/.libs/libmxm_col.so -o k=640,g=0 -U10000 -s10<br />
# CPU #<br />
# e.g. Linking against acml, and fupermod is configured under acml_config<br />
-host localhost -np 47 $HOME/fupermod/acml_config/tools/builder -l $HOME/fupermod/acml_config/routines/mxm/.libs/libmxm_col.so -o k=640 -U10000 -s10<br />
<br />
* Data partitioning<br />
<br />
** matrix size D = N x N, N = sqrt(D), and machinefile lists the nodes participating in the computing<br />
<br />
$ fupermod/tools/partitioner -l fupermod/routines/mxm/.libs/libmxm_col.so -D10000 -o N=100 -m machinefile<br />
<br />
* Running matrix multiplication<br />
<br />
$ mpirun -rf rankfile -app appfile_mxm<br />
<br />
** example of an appfile for matrix multiplication<br />
# GPU #<br />
# Assuming fupermod is configured under cublas_config, linking against cublas<br />
# -g0 means device 0 is selected for computing<br />
-host localhost -np 1 $HOME/fupermod/cublas_config/routines/mxm/mxm_col -k640 -g0 -m machinefile<br />
# CPU #<br />
# Assuming fupermod is configured under acml_config, linking against acml<br />
-host localhost -np 47 $HOME/fupermod/acml_config/routines/mxm/mxm_col -k640 -m machinefile</div>Zhongziminghttps://hcl.ucd.ie/wiki/index.php?title=UTK_multicores_%2B_GPU&diff=738UTK multicores + GPU2012-07-12T10:54:07Z<p>Zhongziming: </p>
<hr />
<div>== List of machines ==<br />
http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180<br />
<br />
== Display a list of available GPUs ==<br />
$ nvidia-smi -L<br />
<br />
== Using Fupermod on hybrid multicore/GPUs node ==<br />
* Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl)and GPU cblas (e.g. cublas). <br />
<br />
** For example: Using acml blas for CPU and cublas for GPU computing<br />
<br />
cd fupermod/<br />
<br />
mkdir acml_config <br />
cd acml_config<br />
./configure --with-cblas=acml<br />
<br />
mkdir cuda_config <br />
cd cuda_config<br />
./configure --with-cblas=cuda<br />
<br />
* Building performance model:<br />
<br />
** rankfile is for processing binding, and appfile tells mpi what programs to launch<br />
$ mpirun -rf rankfile -app appfile_fpm<br />
<br />
** example of a rankfile:<br />
rank 0=ig.icl.utk.edu slot=0:0<br />
rank 1=ig.icl.utk.edu slot=0:1<br />
...<br />
<br />
** example of an appfile for building functional permanence model (FPM):<br />
# GPU<br />
# e.g. Linking against cublas, and fupermod is configured under cublas_config<br />
# suboption g=0 means device 0 is selected for computing<br />
-host localhost -np 1 $HOME/fupermod/cublas_config/tools/builder -l $HOME/fupermod/cublas_config/routines/mxm/.libs/libmxm_col.so -o k=640,g=0 -U10000 -s10<br />
# -----------------------------------------------------------------------------<br />
# CPU<br />
# e.g. Linking against acml, and fupermod is configured under acml_config<br />
-host localhost -np 47 $HOME/fupermod/acml_config/tools/builder -l $HOME/fupermod/acml_config/routines/mxm/.libs/libmxm_col.so -o k=640 -U10000 -s10<br />
<br />
* Data partitioning<br />
<br />
** matrix size D = N x N, N = sqrt(D), and machinefile lists the nodes participating in the computing<br />
<br />
$ fupermod/tools/partitioner -l fupermod/routines/mxm/.libs/libmxm_col.so -D10000 -o N=100 -m machinefile<br />
<br />
* Running matrix multiplication<br />
<br />
$ mpirun -rf rankfile -app appfile_mxm<br />
<br />
** example of an appfile for matrix multiplication<br />
# GPU<br />
# Assuming fupermod is configured under cublas_config, linking against cublas<br />
# -g0 means device 0 is selected for computing<br />
-host localhost -np 1 $HOME/fupermod/cublas_config/routines/mxm/mxm_col -k640 -g0 -m machinefile<br />
#--------------------------------------------------------------------------------------------------------<br />
# CPU<br />
# Assuming fupermod is configured under acml_config, linking against acml<br />
-host localhost -np 47 $HOME/fupermod/acml_config/routines/mxm/mxm_col -k640 -m machinefile</div>Zhongziminghttps://hcl.ucd.ie/wiki/index.php?title=UTK_multicores_%2B_GPU&diff=737UTK multicores + GPU2012-07-12T10:49:29Z<p>Zhongziming: </p>
<hr />
<div>== List of machines ==<br />
http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180<br />
<br />
== Display a list of available GPUs ==<br />
$ nvidia-smi -L<br />
<br />
== Using Fupermod on hybrid multicore/GPUs node ==<br />
* Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl)and GPU cblas (e.g. cublas). <br />
<br />
** For example: Using acml blas for CPU and cublas for GPU computing<br />
<br />
cd fupermod/<br />
<br />
mkdir acml_config <br />
cd acml_config<br />
./configure --with-cblas=acml<br />
<br />
mkdir cuda_config <br />
cd cuda_config<br />
./configure --with-cblas=cuda<br />
<br />
* Building performance model:<br />
<br />
** rankfile is for processing binding, and appfile tells mpi what programs to launch<br />
$ mpirun -rf rankfile -app appfile_fpm<br />
<br />
** example of a rankfile:<br />
rank 0=ig.icl.utk.edu slot=0:0<br />
rank 1=ig.icl.utk.edu slot=0:1<br />
...<br />
<br />
** example of a appfile for building functional permanence model (FPM):<br />
# GPU<br />
# e.g. Linking against cublas, and fupermod is configured under cublas_config<br />
# suboption g=0 means device 0 is selected for computing<br />
-host localhost -np 1 $HOME/fupermod/cublas_config/tools/builder -l $HOME/fupermod/cublas_config/routines/mxm/.libs/libmxm_col.so -o k=640,g=0 -U10000 -s10<br />
# -----------------------------------------------------------------------------<br />
# CPU<br />
# e.g. Linking against acml, and fupermod is configured under acml_config<br />
-host localhost -np 47 $HOME/fupermod/acml_config/tools/builder -l $HOME/fupermod/acml_config/routines/mxm/.libs/libmxm_col.so -o k=640 -U10000 -s10</div>Zhongziminghttps://hcl.ucd.ie/wiki/index.php?title=UTK_multicores_%2B_GPU&diff=736UTK multicores + GPU2012-07-12T10:48:45Z<p>Zhongziming: </p>
<hr />
<div>== List of machines ==<br />
http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180<br />
<br />
== Display a list of available GPUs ==<br />
$ nvidia-smi -L<br />
<br />
== Using Fupermod on hybrid multicore/GPUs node ==<br />
* Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl)and GPU cblas (e.g. cublas). <br />
<br />
** For example: Using acml blas for CPU and cublas for GPU computing<br />
<br />
cd fupermod/<br />
<br />
mkdir acml_config <br />
cd acml_config<br />
./configure --with-cblas=acml<br />
<br />
mkdir cuda_config <br />
cd cuda_config<br />
./configure --with-cblas=cuda<br />
<br />
* Building performance model:<br />
<br />
** rankfile is for processing binding, and appfile tells mpi what programs to launch<br />
$ mpirun -rf rankfile -app appfile_fpm<br />
<br />
** example of a rankfile:<br />
rank 0=ig.icl.utk.edu slot=0:0<br />
rank 1=ig.icl.utk.edu slot=0:1<br />
...<br />
<br />
** example of a appfile for building functional permanence model (FPM):<br />
# GPU<br />
# e.g. Linking against cublas, and fupermod is configured under cublas_config<br />
# suboption g=0 means device 0 is selected for computing<br />
-host localhost -np 1 $HOME/fupermod/cublas_config/tools/builder -l $HOME/fupermod/cublas_config/routines/mxm/.libs/libmxm_col.so -o k=640,g=0 -U10000 -s10<br />
<br />
# CPU<br />
# e.g. Linking against acml, and fupermod is configured under acml_config<br />
-host localhost -np 47 $HOME/fupermod/acml_config/tools/builder -l $HOME/fupermod/acml_config/routines/mxm/.libs/libmxm_col.so -o k=640 -U10000 -s10</div>Zhongziminghttps://hcl.ucd.ie/wiki/index.php?title=UTK_multicores_%2B_GPU&diff=735UTK multicores + GPU2012-07-12T10:48:30Z<p>Zhongziming: </p>
<hr />
<div>== List of machines ==<br />
http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180<br />
<br />
== Display a list of available GPUs ==<br />
$ nvidia-smi -L<br />
<br />
== Using Fupermod on hybrid multicore/GPUs node ==<br />
* Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl)and GPU cblas (e.g. cublas). <br />
<br />
** For example: Using acml blas for CPU and cublas for GPU computing<br />
<br />
cd fupermod/<br />
<br />
mkdir acml_config <br />
cd acml_config<br />
../configure --with-cblas=acml<br />
<br />
mkdir cuda_config <br />
cd cuda_config<br />
../configure --with-cblas=cuda<br />
<br />
* Building performance model:<br />
<br />
** The rankfile is for process binding, and the appfile tells mpirun what programs to launch<br />
$ mpirun -rf rankfile -app appfile_fpm<br />
<br />
** example of a rankfile:<br />
rank 0=ig.icl.utk.edu slot=0:0<br />
rank 1=ig.icl.utk.edu slot=0:1<br />
...<br />
<br />
** example of an appfile for building the functional performance model (FPM):<br />
# GPU<br />
# e.g. Linking against cublas, and fupermod is configured under cublas_config<br />
# suboption g=0 means device 0 is selected for computing<br />
-host localhost -np 1 $HOME/fupermod/cublas_config/tools/builder -l $HOME/fupermod/cublas_config/routines/mxm/.libs/libmxm_col.so -o k=640,g=0 -U10000 -s10<br />
#---------------------------------------------------------------------------------------------------------------------------------------------------------------<br />
<br />
# CPU<br />
# e.g. Linking against acml, and fupermod is configured under acml_config<br />
-host localhost -np 47 $HOME/fupermod/acml_config/tools/builder -l $HOME/fupermod/acml_config/routines/mxm/.libs/libmxm_col.so -o k=640 -U10000 -s10</div>Zhongziminghttps://hcl.ucd.ie/wiki/index.php?title=UTK_multicores_%2B_GPU&diff=734UTK multicores + GPU2012-07-12T10:47:34Z<p>Zhongziming: </p>
<hr />
<div>== List of machines ==<br />
http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180<br />
<br />
== Display a list of available GPUs ==<br />
$ nvidia-smi -L<br />
<br />
== Using Fupermod on hybrid multicore/GPUs node ==<br />
* Compiling: Create two separate directories for configuration with the selected CPU cblas (e.g. gsl, acml, mkl) and GPU cblas (e.g. cublas). <br />
** For example: Using acml blas for CPU and cublas for GPU computing<br />
<br />
cd fupermod/<br />
<br />
mkdir acml_config <br />
cd acml_config<br />
../configure --with-cblas=acml<br />
<br />
mkdir cuda_config <br />
cd cuda_config<br />
../configure --with-cblas=cuda<br />
<br />
* Building performance model:<br />
** The rankfile is for process binding, and the appfile tells mpirun what programs to launch<br />
$ mpirun -rf rankfile -app appfile_fpm<br />
<br />
** example of a rankfile:<br />
rank 0=ig.icl.utk.edu slot=0:0<br />
rank 1=ig.icl.utk.edu slot=0:1<br />
...<br />
<br />
** example of an appfile for building the functional performance model (FPM):<br />
# GPU<br />
# e.g. Linking against cublas, and fupermod is configured under cublas_config<br />
# suboption g=0 means device 0 is selected for computing<br />
<br />
-host localhost -np 1 $HOME/fupermod/cublas_config/tools/builder -l $HOME/fupermod/cublas_config/routines/mxm/.libs/libmxm_col.so -o k=640,g=0 -U10000 -s10<br />
<br />
#---------------------------------------------------------------------------------------------------------------------------------------------------------------<br />
<br />
# CPU<br />
# e.g. Linking against acml, and fupermod is configured under acml_config<br />
<br />
-host localhost -np 47 $HOME/fupermod/acml_config/tools/builder -l $HOME/fupermod/acml_config/routines/mxm/.libs/libmxm_col.so -o k=640 -U10000 -s10</div>Zhongziminghttps://hcl.ucd.ie/wiki/index.php?title=UTK_multicores_%2B_GPU&diff=733UTK multicores + GPU2012-07-12T10:45:38Z<p>Zhongziming: </p>
<hr />
<div>== List of machines ==<br />
http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180<br />
<br />
== Display a list of available GPUs ==<br />
$ nvidia-smi -L<br />
<br />
== Using Fupermod on hybrid multicore/GPUs node ==<br />
* Compiling: Create two separate directories for configuration with the selected CPU cblas (e.g. gsl, acml, mkl) and GPU cblas (e.g. cublas). <br />
** For example: Using acml blas for CPU and cublas for GPU computing<br />
<br />
cd fupermod/<br />
<br />
mkdir acml_config <br />
cd acml_config<br />
../configure --with-cblas=acml<br />
<br />
mkdir cuda_config <br />
cd cuda_config<br />
../configure --with-cblas=cuda<br />
<br />
* Building performance model:<br />
** The rankfile is for process binding, and the appfile tells mpirun what programs to launch<br />
$ mpirun -rf rankfile -app appfile_fpm<br />
<br />
example of a rankfile:<br />
rank 0=ig.icl.utk.edu slot=0:0<br />
rank 1=ig.icl.utk.edu slot=0:1<br />
.<br />
.<br />
.</div>Zhongziminghttps://hcl.ucd.ie/wiki/index.php?title=UTK_multicores_%2B_GPU&diff=732UTK multicores + GPU2012-07-12T10:44:06Z<p>Zhongziming: </p>
<hr />
<div>== List of machines ==<br />
http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180<br />
<br />
== Display a list of available GPUs ==<br />
$ nvidia-smi -L<br />
<br />
== Using Fupermod on hybrid multicore/GPUs node ==<br />
* Compiling: Create two separate directories for configuration with the selected CPU cblas (e.g. gsl, acml, mkl) and GPU cblas (e.g. cublas). <br />
** For example: Using acml blas for CPU and cublas for GPU computing<br />
<br />
cd fupermod/<br />
<br />
mkdir acml_config <br />
cd acml_config<br />
../configure --with-cblas=acml<br />
<br />
mkdir cuda_config <br />
cd cuda_config<br />
../configure --with-cblas=cuda<br />
<br />
* Building performance model:<br />
** The rankfile is for process binding, and the appfile tells mpirun what programs to launch<br />
$ mpirun -rf rankfile -app appfile_fpm</div>Zhongziminghttps://hcl.ucd.ie/wiki/index.php?title=UTK_multicores_%2B_GPU&diff=731UTK multicores + GPU2012-07-12T10:42:16Z<p>Zhongziming: </p>
<hr />
<div>== List of machines ==<br />
http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180<br />
<br />
== Display a list of available GPUs ==<br />
$ nvidia-smi -L<br />
<br />
== Using Fupermod on hybrid multicore/GPUs node ==<br />
* Compiling: Create two separate directories for configuration with the selected CPU cblas (e.g. gsl, acml, mkl) and GPU cblas (e.g. cublas). <br />
** For example: Using acml blas for CPU and cublas for GPU computing<br />
<br />
cd fupermod/<br />
<br />
mkdir acml_config <br />
cd acml_config<br />
../configure --with-cblas=acml<br />
<br />
mkdir cuda_config <br />
cd cuda_config<br />
../configure --with-cblas=cuda<br />
<br />
* Build performance model:<br />
<br />
/*<br />
* rankfile is for process binding<br />
* appfile tells what processes will execute<br />
*/<br />
<br />
$ mpirun -rf rankfile -app appfile_fpm</div>Zhongziminghttps://hcl.ucd.ie/wiki/index.php?title=UTK_multicores_%2B_GPU&diff=730UTK multicores + GPU2012-07-12T10:40:04Z<p>Zhongziming: </p>
<hr />
<div>== List of machines ==<br />
http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180<br />
<br />
== Display a list of available GPUs ==<br />
$ nvidia-smi -L<br />
<br />
== Using Fupermod on hybrid multicore/GPUs node ==<br />
*Compiling: Create two separate directories for configuration with the selected CPU cblas (e.g. gsl, acml, mkl) and GPU cblas (e.g. cublas). <br />
**For example: Using acml blas for CPU and cublas for GPU computing<br />
<br />
cd fupermod/<br />
<br />
mkdir acml_config <br />
cd acml_config<br />
./configure --with-cblas=acml<br />
<br />
mkdir cuda_config <br />
cd cuda_config<br />
./configure --with-cblas=cuda</div>Zhongziminghttps://hcl.ucd.ie/wiki/index.php?title=UTK_multicores_%2B_GPU&diff=729UTK multicores + GPU2012-07-12T10:38:57Z<p>Zhongziming: </p>
<hr />
<div>== List of machines ==<br />
http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180<br />
<br />
== Display a list of available GPUs ==<br />
$ nvidia-smi -L<br />
<br />
== Using Fupermod on hybrid multicore/GPUs node ==<br />
* Compiling: Create two separate directories for configuration with the selected CPU cblas (e.g. gsl, acml, mkl) and GPU cblas (e.g. cublas). <br />
<br />
cd fupermod/<br />
<br />
** Using acml for CPU computing<br />
mkdir acml_config <br />
cd acml_config<br />
./configure --with-cblas=acml<br />
<br />
** Using cublas for GPU computing <br />
mkdir cuda_config <br />
cd cuda_config<br />
./configure --with-cblas=cuda</div>Zhongziminghttps://hcl.ucd.ie/wiki/index.php?title=UTK_multicores_%2B_GPU&diff=728UTK multicores + GPU2012-07-12T10:38:26Z<p>Zhongziming: </p>
<hr />
<div>== List of machines ==<br />
http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180<br />
<br />
== Display a list of available GPUs ==<br />
$ nvidia-smi -L<br />
<br />
== Using Fupermod on hybrid multicore/GPUs node ==<br />
* Compiling: Create two separate directories for configuration with the selected CPU cblas (e.g. gsl, acml, mkl) and GPU cblas (e.g. cublas). <br />
<br />
cd fupermod/<br />
<br />
/* Using acml for CPU computing */<br />
mkdir acml_config <br />
cd acml_config<br />
./configure --with-cblas=acml<br />
<br />
/* Using cublas for GPU computing */ <br />
mkdir cuda_config <br />
cd cuda_config<br />
./configure --with-cblas=cuda</div>Zhongziminghttps://hcl.ucd.ie/wiki/index.php?title=UTK_multicores_%2B_GPU&diff=727UTK multicores + GPU2012-07-12T10:38:06Z<p>Zhongziming: </p>
<hr />
<div>== List of machines ==<br />
http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180<br />
<br />
== Display a list of available GPUs ==<br />
$ nvidia-smi -L<br />
<br />
== Using Fupermod on hybrid multicore/GPUs node ==<br />
* Compiling: Create two separate directories for configuration with the selected CPU cblas (e.g. gsl, acml, mkl) and GPU cblas (e.g. cublas). <br />
<br />
For example:<br />
cd fupermod/<br />
<br />
/* Using acml for CPU computing */<br />
mkdir acml_config <br />
cd acml_config<br />
./configure --with-cblas=acml<br />
<br />
/* Using cublas for GPU computing */ <br />
mkdir cuda_config <br />
cd cuda_config<br />
./configure --with-cblas=cuda</div>Zhongziminghttps://hcl.ucd.ie/wiki/index.php?title=UTK_multicores_%2B_GPU&diff=726UTK multicores + GPU2012-07-12T10:37:11Z<p>Zhongziming: </p>
<hr />
<div>== List of machines ==<br />
http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180<br />
<br />
== Display a list of available GPUs ==<br />
$ nvidia-smi -L<br />
<br />
== Using Fupermod on hybrid multicore/GPUs node ==<br />
* Compiling: Create two separate directories for configuration with the selected CPU cblas (e.g. gsl, acml, mkl) and GPU cblas (e.g. cublas). <br />
<br />
For example:<br />
<br />
cd fupermod/<br />
<br />
/* Using acml for CPU computing */<br />
mkdir acml_config <br />
cd acml_config<br />
./configure --with-cblas=acml<br />
<br />
/* Using cublas for GPU computing */ <br />
mkdir cuda_config <br />
cd cuda_config<br />
./configure --with-cblas=cuda</div>Zhongziminghttps://hcl.ucd.ie/wiki/index.php?title=UTK_multicores_%2B_GPU&diff=725UTK multicores + GPU2012-07-12T10:36:37Z<p>Zhongziming: </p>
<hr />
<div>== List of machines ==<br />
http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180<br />
<br />
== Display a list of available GPUs ==<br />
$ nvidia-smi -L<br />
<br />
== Using Fupermod on hybrid multicore/GPUs node ==<br />
*Compiling: Create two separate directories for configuration with the selected CPU cblas (e.g. gsl, acml, mkl) and GPU cblas (e.g. cublas). <br />
<br />
For example:<br />
<br />
cd fupermod/<br />
<br />
/* Using acml for CPU computing */<br />
mkdir acml_config <br />
cd acml_config<br />
./configure --with-cblas=acml<br />
<br />
/* Using cublas for GPU computing */ <br />
mkdir cuda_config <br />
cd cuda_config<br />
./configure --with-cblas=cuda</div>Zhongziminghttps://hcl.ucd.ie/wiki/index.php?title=UTK_multicores_%2B_GPU&diff=724UTK multicores + GPU2012-07-12T10:35:51Z<p>Zhongziming: </p>
<hr />
<div>== List of machines ==<br />
http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180<br />
<br />
== Display a list of available GPUs ==<br />
$ nvidia-smi -L<br />
<br />
== Using Fupermod on hybrid multicore/GPUs node ==<br />
*Compiling: Create two separate directories for configuration with the selected CPU cblas (e.g. gsl, acml, mkl) and GPU cblas (e.g. cublas). <br />
For example:<br />
<br />
cd fupermod/<br />
<br />
/* Using acml for CPU computing */<br />
mkdir acml_config <br />
cd acml_config<br />
./configure --with-cblas=acml<br />
<br />
/* Using cublas for GPU computing */ <br />
mkdir cuda_config <br />
cd cuda_config<br />
./configure --with-cblas=cuda</div>Zhongziminghttps://hcl.ucd.ie/wiki/index.php?title=UTK_multicores_%2B_GPU&diff=723UTK multicores + GPU2012-07-12T10:35:31Z<p>Zhongziming: </p>
<hr />
<div>== List of machines ==<br />
http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180<br />
<br />
== Display a list of available GPUs ==<br />
$ nvidia-smi -L<br />
<br />
== Using Fupermod on hybrid multicore/GPUs node ==<br />
*Compiling: Create two separate directories for configuration with the selected CPU cblas (e.g. gsl, acml, mkl) and GPU cblas (e.g. cublas). For example:<br />
<br />
cd fupermod/<br />
/* Using acml for CPU computing */<br />
mkdir acml_config <br />
cd acml_config<br />
./configure --with-cblas=acml<br />
/* Using cublas for GPU computing */ <br />
mkdir cuda_config <br />
cd cuda_config<br />
./configure --with-cblas=cuda</div>Zhongziminghttps://hcl.ucd.ie/wiki/index.php?title=UTK_multicores_%2B_GPU&diff=722UTK multicores + GPU2012-07-12T10:34:21Z<p>Zhongziming: </p>
<hr />
<div>== List of machines ==<br />
http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180<br />
<br />
== Display a list of available GPUs ==<br />
$ nvidia-smi -L<br />
<br />
== Using Fupermod on hybrid multicore/GPUs node ==<br />
*Compiling: Create two directories for configuration with CPU cblas (e.g. gsl, acml, mkl) and GPU cblas (e.g. cublas). For example:<br />
<br />
cd fupermod/<br />
<br />
mkdir acml_config /* Using acml for CPU computing */<br />
cd acml_config<br />
./configure --with-cblas=acml<br />
<br />
mkdir cuda_config /* Using cublas for GPU computing */<br />
cd cuda_config<br />
./configure --with-cblas=cuda</div>Zhongziminghttps://hcl.ucd.ie/wiki/index.php?title=UTK_multicores_%2B_GPU&diff=721UTK multicores + GPU2012-07-12T10:33:46Z<p>Zhongziming: </p>
<hr />
<div>== List of machines ==<br />
http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180<br />
<br />
== Display a list of available GPUs ==<br />
$ nvidia-smi -L<br />
<br />
== Using Fupermod on hybrid multicore/GPUs node ==<br />
*Compiling: Create two directories for configuration with CPU cblas (e.g. gsl, acml, mkl) and GPU cblas (e.g. cublas). For example:<br />
<br />
cd fupermod/<br />
<br />
/* Using acml for CPU computing */<br />
mkdir acml_config<br />
cd acml_config<br />
./configure --with-cblas=acml<br />
<br />
/* Using cublas for GPU computing */<br />
mkdir cuda_config<br />
cd cuda_config<br />
./configure --with-cblas=cuda</div>Zhongziminghttps://hcl.ucd.ie/wiki/index.php?title=UTK_multicores_%2B_GPU&diff=720UTK multicores + GPU2012-07-12T10:31:41Z<p>Zhongziming: </p>
<hr />
<div>== List of machines ==<br />
http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180<br />
<br />
== Display a list of available GPUs ==<br />
$ nvidia-smi -L<br />
<br />
== Using Fupermod on hybrid multicore/GPUs node ==<br />
*Compiling: Create two directories for configuring with cblas (for CPU) and cublas (for GPU). For example:<br />
/* Using acml for CPU computing */<br />
mkdir acml_config<br />
cd acml_config<br />
./configure --with-cblas=acml<br />
<br />
/* Using cublas for GPU computing */<br />
mkdir cuda_config<br />
cd cuda_config<br />
./configure --with-cblas=cuda</div>Zhongziminghttps://hcl.ucd.ie/wiki/index.php?title=UTK_multicores_%2B_GPU&diff=719UTK multicores + GPU2012-07-12T10:22:34Z<p>Zhongziming: /* Using Fupermod on hybrid node */</p>
<hr />
<div>== List of machines ==<br />
http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180<br />
<br />
== Display a list of available GPUs ==<br />
$ nvidia-smi -L<br />
<br />
== Using Fupermod on hybrid node ==<br />
*Compiling<br />
Currently the user needs to compile the code for CPU and GPU separately</div>Zhongziminghttps://hcl.ucd.ie/wiki/index.php?title=UTK_multicores_%2B_GPU&diff=718UTK multicores + GPU2012-07-12T10:20:57Z<p>Zhongziming: </p>
<hr />
<div>== List of machines ==<br />
http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180<br />
<br />
== Display a list of available GPUs ==<br />
$ nvidia-smi -L<br />
<br />
== Using Fupermod on hybrid node ==<br />
*Compiling</div>Zhongziminghttps://hcl.ucd.ie/wiki/index.php?title=UTK_multicores_%2B_GPU&diff=717UTK multicores + GPU2012-07-12T10:11:59Z<p>Zhongziming: </p>
<hr />
<div>== List of machines ==<br />
http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180<br />
<br />
== Display a list of available GPUs ==<br />
$ nvidia-smi -L</div>Zhongziminghttps://hcl.ucd.ie/wiki/index.php?title=UTK_multicores_%2B_GPU&diff=716UTK multicores + GPU2012-07-12T10:06:19Z<p>Zhongziming: </p>
<hr />
<div>== List of machines ==<br />
http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180<br />
<br />
== Getting the info of GPUs on a node ==<br />
nvidia-smi -L</div>Zhongziminghttps://hcl.ucd.ie/wiki/index.php?title=UTK_multicores_%2B_GPU&diff=715UTK multicores + GPU2012-07-12T10:05:31Z<p>Zhongziming: </p>
<hr />
<div><br />
== List of machines ==<br />
http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180<br />
<br />
== Checking number and type of GPUs ==<br />
nvidia-smi -L</div>Zhongziminghttps://hcl.ucd.ie/wiki/index.php?title=UTK_multicores_%2B_GPU&diff=714UTK multicores + GPU2012-07-12T10:04:59Z<p>Zhongziming: </p>
<hr />
<div>== List of machines ==<br />
http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 UTK machines<br />
<br />
== Checking number and type of GPUs ==<br />
nvidia-smi -L</div>Zhongziminghttps://hcl.ucd.ie/wiki/index.php?title=Grid5000&diff=596Grid50002011-05-03T18:35:04Z<p>Zhongziming: /* Setting up new deploy image */</p>
<hr />
<div>https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home <br />
<br />
[https://www.grid5000.fr/mediawiki/index.php/Grid5000:UserCharter USAGE POLICY]<br />
<br />
== Login, job submission, deployment of image ==<br />
<br />
*Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] <br />
*Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can directly ssh frontend node of any other site:<br />
<br />
<source lang="bash"><br />
access_$ ssh frontend.SITE2<br />
</source> <br />
<br />
*There is no access to Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. <br />
*Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it '''scp, sftp, rsync''' between access or frontend nodes. <br />
*Jobs are run from the frontend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. Basic commands: <br />
**'''oarstat''' - queue status <br />
**'''oarsub''' - job submission <br />
**'''oardel''' - job removal<br />
<br />
Interactive job on deployed images: <source lang="bash"><br />
frontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"']<br />
</source> Batch job on installed images: <source lang="bash"><br />
frontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"']<br />
</source> <br />
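Concrete invocations, for illustration only (the walltimes, node counts, batch script name and cluster name are placeholders; valid property names depend on the site): <source lang="bash"><br />
# interactive deploy job: 4 nodes for 2 hours<br />
frontend_$ oarsub -I -t deploy -l nodes=4,walltime=2:00:00<br />
# batch job on 2 nodes of a named cluster<br />
frontend_$ oarsub ./run.sh -t allow_classic_ssh -l nodes=2,walltime=1:00:00 -p 'cluster="mycluster"'<br />
</source> <br />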
<br />
*The image to deploy can be created and loaded with help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here]<br />
<br />
Loading: <source lang="bash"><br />
frontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES <br />
</source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000. <br />
<br />
== Compiling and running MPI applications ==<br />
<br />
*Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) <br />
*Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] <br />
**mpirun/mpiexec should be run from one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`)<br />
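For example, a minimal sketch of launching an executable over all reserved slots from the first reserved node (my_app is a placeholder; this assumes the MPICH2/Hydra mpiexec built below, while Open MPI takes -machinefile instead of -f): <source lang="bash"><br />
node_$ mpiexec -f $OAR_NODEFILE -n $(cat $OAR_NODEFILE | wc -l) ./my_app<br />
</source> <br />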
<br />
== Setting up new deploy image ==<br />
<br />
oarsub -I -t deploy -l nodes=1,walltime=12<br />
kadeploy3 -e lenny-x64-nfs -f $OAR_FILE_NODES -k<br />
ssh root@`head -n 1 $OAR_NODEFILE`<br />
<br />
edit /etc/apt/sources.list <br />
<br />
apt-get update<br />
apt-get upgrade<br />
<br />
apt-get install libtool autoconf automake mc colorgcc ctags libboost-serialization-dev libboost-graph-dev libatlas-base-dev gfortran vim gdb valgrind screen subversion<br />
<br />
<br />
Compiled from sources by us: <br />
<br />
* gsl-1.14 (download: ftp://ftp.gnu.org/gnu/gsl/)<br />
./configure && make && make install<br />
* mpich2 (download: http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads)<br />
./configure --enable-shared --enable-sharedlibs=gcc<br />
make && make install<br />
<br />
Mpich2 installed to:<br />
Installing MPE2 include files to /usr/local/include<br />
Installing MPE2 libraries to /usr/local/lib<br />
Installing MPE2 utility programs to /usr/local/bin<br />
Installing MPE2 configuration files to /usr/local/etc<br />
Installing MPE2 system utility programs to /usr/local/sbin<br />
Installing MPE2 man to /usr/local/share/man<br />
Installing MPE2 html to /usr/local/share/doc/<br />
Installed MPE2 in /usr/local<br />
<br />
* hwloc (and lstopo) (download: http://www.open-mpi.org/software/hwloc/v1.2/)<br />
compile from sources. To get xml support install libxml2-dev and pkg-config<br />
apt-get install libxml2-dev pkg-config<br />
tar -xzvf hwloc-1.1.1.tar.gz<br />
cd hwloc-1.1.1<br />
./configure && make && make install<br />
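A quick, optional sanity check of the hwloc install (the .xml filename is arbitrary; writing XML output relies on the libxml2 support installed above):<br />
lstopo<br />
lstopo node-topology.xml<br />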
<br />
Cleanup <br />
<br />
apt-get clean<br />
rm /etc/udev/rules.d/*-persistent-net.rules<br />
<br />
Make image <br />
<br />
ssh root@'''node''' tgz-g5k &gt; $HOME/grid5000/'''imagename'''.tgz<br />
<br />
make appropriate .env file.<br />
kaenv3 -p lenny-x64-nfs -u deploy > lenny-x64-custom-2.3.env</div>Zhongziminghttps://hcl.ucd.ie/wiki/index.php?title=Grid5000&diff=595Grid50002011-05-03T16:51:40Z<p>Zhongziming: /* Setting up new deploy image */</p>
<hr />
<div>https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home <br />
<br />
[https://www.grid5000.fr/mediawiki/index.php/Grid5000:UserCharter USAGE POLICY]<br />
<br />
== Login, job submission, deployment of image ==<br />
<br />
*Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] <br />
*Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can directly ssh frontend node of any other site:<br />
<br />
<source lang="bash"><br />
access_$ ssh frontend.SITE2<br />
</source> <br />
<br />
*There is no access to Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. <br />
*Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it '''scp, sftp, rsync''' between access or frontend nodes. <br />
*Jobs are run from the frontend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. Basic commands: <br />
**'''oarstat''' - queue status <br />
**'''oarsub''' - job submission <br />
**'''oardel''' - job removal<br />
<br />
Interactive job on deployed images: <source lang="bash"><br />
fontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"']<br />
</source> Batch job on installed images: <source lang="bash"><br />
fontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"']<br />
</source> <br />
<br />
*The image to deploy can be created and loaded with help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here]<br />
<br />
Loading: <source lang="bash"><br />
fontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES <br />
</source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000. <br />
<br />
== Compiling and running MPI applications ==<br />
<br />
*Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) <br />
*Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] <br />
**mpirun/mpiexec should be run from one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`)<br />
<br />
== Setting up new deploy image ==<br />
<br />
oarsub -I -t deploy -l nodes=1,walltime=12<br />
kadeploy3 -e lenny-x64-nfs -f $OAR_FILE_NODES -k<br />
ssh root@`head -n 1 $OAR_NODEFILE`<br />
<br />
edit /etc/apt/sources.list <br />
<br />
apt-get update<br />
apt-get upgrade<br />
<br />
apt-get install libtool autoconf automake mc colorgcc ctags libboost-serialization-dev libboost-graph-dev libatlas-base-dev gfortran vim gdb valgrind screen subversion<br />
<br />
<br />
Compiled from sources by us: <br />
<br />
* gsl-1.14 (download: ftp://ftp.gnu.org/gnu/gsl/)<br />
./configure && make && make install<br />
* mpich2 (download: http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads)<br />
./configure --enable-shared --enable-sharedlibs=gcc<br />
make && make install<br />
<br />
Mpich2 installed to:<br />
Installing MPE2 include files to /usr/local/include<br />
Installing MPE2 libraries to /usr/local/lib<br />
Installing MPE2 utility programs to /usr/local/bin<br />
Installing MPE2 configuration files to /usr/local/etc<br />
Installing MPE2 system utility programs to /usr/local/sbin<br />
Installing MPE2 man to /usr/local/share/man<br />
Installing MPE2 html to /usr/local/share/doc/<br />
Installed MPE2 in /usr/local<br />
<br />
* hwloc (and lstopo) (download: http://www.open-mpi.org/software/hwloc/v1.2/)<br />
compile from sources. To get xml support install libxml2-dev and pkg-config<br />
apt-get install libxml2-dev pkg-config<br />
tar -xzvf hwloc-1.1.1.tar.gz<br />
cd hwloc-1.1.1<br />
./configure && make && make install<br />
<br />
Cleanup <br />
<br />
apt-get clean<br />
rm /etc/udev/rules.d/*-persistent-net.rules<br />
<br />
Make image <br />
<br />
ssh root@'''node''' tgz-g5k &gt; $HOME/grid5000/'''imagename'''.tgz<br />
<br />
make appropriate .env file.</div>Zhongziminghttps://hcl.ucd.ie/wiki/index.php?title=Grid5000&diff=594Grid50002011-05-03T16:43:28Z<p>Zhongziming: /* Setting up new deploy image */</p>
<hr />
<div>https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home <br />
<br />
[https://www.grid5000.fr/mediawiki/index.php/Grid5000:UserCharter USAGE POLICY]<br />
<br />
== Login, job submission, deployment of image ==<br />
<br />
*Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] <br />
*Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can directly ssh frontend node of any other site:<br />
<br />
<source lang="bash"><br />
access_$ ssh frontend.SITE2<br />
</source> <br />
<br />
*There is no access to Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. <br />
*Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it '''scp, sftp, rsync''' between access or frontend nodes. <br />
*Jobs are run from the frontend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. Basic commands: <br />
**'''oarstat''' - queue status <br />
**'''oarsub''' - job submission <br />
**'''oardel''' - job removal<br />
<br />
Interactive job on deployed images: <source lang="bash"><br />
fontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"']<br />
</source> Batch job on installed images: <source lang="bash"><br />
fontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"']<br />
</source> <br />
<br />
*The image to deploy can be created and loaded with help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here]<br />
<br />
Loading: <source lang="bash"><br />
fontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES <br />
</source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000. <br />
<br />
== Compiling and running MPI applications ==<br />
<br />
*Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) <br />
*Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] <br />
**mpirun/mpiexec should be run from one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`)<br />
<br />
== Setting up new deploy image ==<br />
<br />
oarsub -I -t deploy -l nodes=1,walltime=12<br />
kadeploy3 -e lenny-x64-nfs -f $OAR_FILE_NODES -k<br />
ssh root@`head -n 1 $OAR_NODEFILE`<br />
<br />
edit /etc/apt/sources.list <br />
<br />
apt-get update<br />
apt-get upgrade<br />
<br />
apt-get install libtool autoconf automake mc colorgcc ctags libboost-serialization-dev libboost-graph-dev libatlas-base-dev gfortran vim gdb valgrind screen subversion<br />
<br />
<br />
Compiled from sources by us: <br />
<br />
* gsl-1.14 (download: ftp://ftp.gnu.org/gnu/gsl/)<br />
./configure && make && make install<br />
* mpich2 (download: http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads)<br />
./configure --enable-shared --enable-sharedlibs=gcc<br />
make && make install<br />
<br />
Mpich2 installed to:<br />
Installing MPE2 include files to /usr/local/include<br />
Installing MPE2 libraries to /usr/local/lib<br />
Installing MPE2 utility programs to /usr/local/bin<br />
Installing MPE2 configuration files to /usr/local/etc<br />
Installing MPE2 system utility programs to /usr/local/sbin<br />
Installing MPE2 man to /usr/local/share/man<br />
Installing MPE2 html to /usr/local/share/doc/<br />
Installed MPE2 in /usr/local<br />
<br />
* hwloc (and lstopo) (download: http://www.open-mpi.org/software/hwloc/v1.2/)<br />
compile from sources. To get xml support install libxml2-dev and pkg-config<br />
apt-get install libxml2-dev pkg-config<br />
tar -xzvf hwloc-1.1.1.tar.gz<br />
cd hwloc-1.1.1<br />
./configure && make && make install<br />
<br />
Cleanup <br />
<br />
apt-get clean<br />
rm /etc/udev/rules.d/*-persistent-net.rules<br />
<br />
Make image <br />
<br />
ssh root@'''node''' tgz-g5k &gt; $HOME/grid5000/'''imagename'''.tgz<br />
<br />
make appropriate .env file.</div>Zhongziminghttps://hcl.ucd.ie/wiki/index.php?title=Grid5000&diff=593Grid50002011-05-03T16:39:38Z<p>Zhongziming: /* Setting up new deploy image */</p>
<hr />
<div>https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home <br />
<br />
[https://www.grid5000.fr/mediawiki/index.php/Grid5000:UserCharter USAGE POLICY]<br />
<br />
== Login, job submission, deployment of image ==<br />
<br />
*Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] <br />
*Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can directly ssh frontend node of any other site:<br />
<br />
<source lang="bash"><br />
access_$ ssh frontend.SITE2<br />
</source> <br />
<br />
*There is no access to Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. <br />
*Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it '''scp, sftp, rsync''' between access or frontend nodes. <br />
*Jobs are run from the frontend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. Basic commands: <br />
**'''oarstat''' - queue status <br />
**'''oarsub''' - job submission <br />
**'''oardel''' - job removal<br />
<br />
Interactive job on deployed images: <source lang="bash"><br />
fontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"']<br />
</source> Batch job on installed images: <source lang="bash"><br />
fontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"']<br />
</source> <br />
<br />
*The image to deploy can be created and loaded with help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here]<br />
<br />
Loading: <source lang="bash"><br />
fontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES <br />
</source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000. <br />
<br />
== Compiling and running MPI applications ==<br />
<br />
*Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) <br />
*Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] <br />
**mpirun/mpiexec should be run from one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`)<br />
<br />
== Setting up new deploy image ==<br />
<br />
oarsub -I -t deploy -l nodes=1,walltime=12<br />
kadeploy3 -e lenny-x64-nfs -f $OAR_FILE_NODES -k<br />
ssh root@`head -n 1 $OAR_NODEFILE`<br />
<br />
edit /etc/apt/sources.list <br />
<br />
apt-get update<br />
apt-get upgrade<br />
<br />
apt-get install libtool autoconf automake mc colorgcc ctags libboost-serialization-dev libboost-graph-dev libatlas-base-dev gfortran vim gdb valgrind screen subversion<br />
<br />
<br />
Compiled from sources by us: <br />
<br />
* gsl-1.14 (download: ftp://ftp.gnu.org/gnu/gsl/)<br />
./configure && make && make install<br />
* mpich2 (download: http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads)<br />
./configure --enable-shared --enable-sharedlibs=gcc<br />
make && make install<br />
<br />
Mpich2 installed to:<br />
Installing MPE2 include files to /usr/local/include<br />
Installing MPE2 libraries to /usr/local/lib<br />
Installing MPE2 utility programs to /usr/local/bin<br />
Installing MPE2 configuration files to /usr/local/etc<br />
Installing MPE2 system utility programs to /usr/local/sbin<br />
Installing MPE2 man to /usr/local/share/man<br />
Installing MPE2 html to /usr/local/share/doc/<br />
Installed MPE2 in /usr/local<br />
<br />
* hwloc (and lstopo)<br />
compile from sources. To get xml support install libxml2-dev and pkg-config<br />
apt-get install libxml2-dev pkg-config<br />
tar -xzvf hwloc-1.1.1.tar.gz<br />
cd hwloc-1.1.1<br />
./configure && make && make install<br />
<br />
Cleanup <br />
<br />
apt-get clean<br />
rm /etc/udev/rules.d/*-persistent-net.rules<br />
<br />
Make image <br />
<br />
ssh root@'''node''' tgz-g5k &gt; $HOME/grid5000/'''imagename'''.tgz<br />
<br />
make appropriate .env file.</div>Zhongziming