Difference between revisions of "Grid5000"
From HCL
					
										
					
Edit by Zhongziming (→Setting up new deploy image)
Revision as of 02:37, 19 May 2011
https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home
USAGE POLICY: https://www.grid5000.fr/mediawiki/index.php/Grid5000:UserCharter
Login, job submission, deployment of image
- Select sites and clusters for experiments, using the information on the Grid5000 network and Status pages.
- Access is provided via the access nodes access.SITE.grid5000.fr that are marked as accessible from everywhere, via ssh with the keyboard-interactive authentication method. Once you are on one of the sites, you can ssh directly to the frontend node of any other site:
 
access_$ ssh frontend.SITE2
- Computing nodes have no access to the Internet (external IPs must be registered on the proxy), so download or update your files on the access nodes. Several revision control clients are available.
- Each site has a separate NFS. To run an application on several sites at once, copy it between access or frontend nodes with scp, sftp or rsync.
- Jobs are run from the frontend nodes, using OAR, a PBS-like system. Basic commands:
- oarstat - queue status
 - oarsub - job submission
 - oardel - job removal
 
 
frontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"']
frontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"']
- The image to deploy can be created and loaded with Kadeploy, a Systemimager-like system. Creating an image is described here. Loading:
 
frontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES
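The resource string passed to oarsub -l above follows the pattern [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]]. The sketch below only assembles that string. oar_resources is a hypothetical helper name, not part of OAR, and the argument order is an assumption made for illustration.

```shell
# Hypothetical helper (not part of OAR): build the argument for "oarsub -l".
# Usage: oar_resources NODES WALLTIME [CLUSTERS]
oar_resources() {
    nodes="$1"
    walltime="$2"
    clusters="${3:-}"
    if [ -n "$clusters" ]; then
        # Optional /cluster=N/ prefix, as in the pattern shown above
        printf '/cluster=%s/nodes=%s,walltime=%s\n' "$clusters" "$nodes" "$walltime"
    else
        printf 'nodes=%s,walltime=%s\n' "$nodes" "$walltime"
    fi
}

# Example (on a frontend node):
#   oarsub -I -t deploy -l "$(oar_resources 2 1:30)"
```

Keeping the string in one place makes it harder to mistype the comma-separated form when the same reservation is used for both interactive and batch jobs.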
Compiling and running MPI applications
- Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`)
 - Running MPI applications is described here 
- mpirun/mpiexec should be run from one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`)
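Picking the node to log in to, and a list of distinct reserved hosts, can be scripted. This is a minimal sketch, assuming the node file ($OAR_NODEFILE) holds one hostname per line and may repeat a host (for example once per core). first_node and unique_nodes are hypothetical helper names.

```shell
# Hypothetical helpers for working with the OAR node file.

# Print the first reserved node (same as: head -n 1 $OAR_NODEFILE).
first_node() {
    head -n 1 "$1"
}

# Print each reserved host once, sorted, e.g. to build an MPI host file.
unique_nodes() {
    sort -u "$1"
}

# Example (inside a reservation):
#   ssh "$(first_node "$OAR_NODEFILE")"
#   unique_nodes "$OAR_NODEFILE" > "$HOME/hosts.txt"   # file name is an assumption
```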
 
 
Setting up new deploy image
oarsub -I -t deploy -l nodes=1,walltime=12
kadeploy3 -e lenny-x64-nfs -f $OAR_FILE_NODES -k
ssh root@`head -n 1 $OAR_NODEFILE`
edit /etc/apt/sources.list
apt-get update
apt-get upgrade
apt-get install libtool autoconf automake mc colorgcc ctags libboost-serialization-dev libboost-graph-dev libatlas-base-dev gfortran vim gdb valgrind screen subversion
Compiled from sources by us:
- gsl-1.14 (download: ftp://ftp.gnu.org/gnu/gsl/)
 
./configure && make && make install
- mpich2 (download: http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads)

./configure --enable-shared --enable-sharedlibs=gcc
make && make install
Mpich2 installed to:
Installing MPE2 include files to /usr/local/include
Installing MPE2 libraries to /usr/local/lib
Installing MPE2 utility programs to /usr/local/bin
Installing MPE2 configuration files to /usr/local/etc
Installing MPE2 system utility programs to /usr/local/sbin
Installing MPE2 man to /usr/local/share/man
Installing MPE2 html to /usr/local/share/doc/
Installed MPE2 in /usr/local
- hwloc (and lstopo) (download: http://www.open-mpi.org/software/hwloc/v1.2/)
 
Compile from sources. To get XML support, install libxml2-dev and pkg-config first:
apt-get install libxml2-dev pkg-config
tar -xzvf hwloc-1.1.1.tar.gz
cd hwloc-1.1.1
./configure && make && make install
Cleanup
apt-get clean
rm /etc/udev/rules.d/*-persistent-net.rules
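The persistent-net rules are presumably removed because udev ties interface names to the MAC addresses of the node the image was built on, which would not match other nodes after deployment. The sketch below wraps that step in a function. The optional root-directory argument is a hypothetical addition, there only so the function can be tried on a scratch directory before running it on a real node.

```shell
# Hypothetical helper: remove udev persistent network rules under ROOT.
# Usage: remove_persistent_net_rules [ROOT]   (ROOT defaults to the real /)
remove_persistent_net_rules() {
    root="${1:-}"
    # -f: stay quiet and succeed if no matching rules file exists
    rm -f "$root"/etc/udev/rules.d/*-persistent-net.rules
}

# Example (as root on the node, before making the image):
#   apt-get clean
#   remove_persistent_net_rules
```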
Make image
ssh root@node tgz-g5k > $HOME/grid5000/imagename.tgz
Make an appropriate .env file:
kaenv3 -p lenny-x64-nfs -u deploy > lenny-x64-custom-2.3.env
 
GotoBLAS2
When compiling GotoBLAS2 on a node without direct Internet access, the build fails with this error:
wget http://www.netlib.org/lapack/lapack-3.1.1.tgz
--2011-05-19 03:11:03--  http://www.netlib.org/lapack/lapack-3.1.1.tgz
Resolving www.netlib.org... 160.36.58.108
Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out.
Retrying.
--2011-05-19 03:14:13--  (try: 2)  http://www.netlib.org/lapack/lapack-3.1.1.tgz
Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out.
Retrying.
...
Fix by downloading http://www.netlib.org/lapack/lapack-3.1.1.tgz to the GotoBLAS2 source directory and editing this line in the Makefile:
184c184
< 	-wget http://www.netlib.org/lapack/lapack-3.1.1.tgz
---
> #	-wget http://www.netlib.org/lapack/lapack-3.1.1.tgz
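The Makefile edit can also be applied with sed instead of by hand. This is a sketch only: disable_lapack_wget is a hypothetical helper name, and it assumes the wget line appears exactly as in the diff above (leading whitespace, then -wget and the LAPACK URL).

```shell
# Hypothetical helper: comment out the LAPACK download line in a GotoBLAS2 Makefile.
# Usage: disable_lapack_wget PATH_TO_MAKEFILE
disable_lapack_wget() {
    makefile="$1"
    tmp="$makefile.tmp"
    # Prefix the matching line with '#', keeping its leading whitespace.
    # A line that is already commented out does not match, so reruns are harmless.
    sed 's|^\([[:space:]]*-wget http://www\.netlib\.org/lapack/lapack-3\.1\.1\.tgz\)|#\1|' \
        "$makefile" > "$tmp" && mv "$tmp" "$makefile"
}

# Example (in the GotoBLAS2 source directory, with lapack-3.1.1.tgz already copied there):
#   disable_lapack_wget Makefile
```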
GotoBLAS needs to be compiled individually for each unique machine type, i.e. for each cluster. Add this to .bashrc:
export CLUSTER=`hostname |sed 's/\([a-z]*\).*/\1/'`
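The sed expression keeps only the leading run of lowercase letters of the hostname. The sketch below wraps it in a function so it can be checked against a sample name. cluster_of is a hypothetical helper, the hostname in the example is made up, and the per-cluster directory layout is an assumption for illustration.

```shell
# Hypothetical helper: derive the cluster name from a hostname by keeping
# its leading lowercase letters (same sed expression as in .bashrc above).
cluster_of() {
    printf '%s\n' "$1" | sed 's/\([a-z]*\).*/\1/'
}

# Example for .bashrc:
#   export CLUSTER=$(cluster_of "$(hostname)")
#   export GOTOBLAS_DIR="$HOME/GotoBLAS2-$CLUSTER"   # hypothetical per-cluster build directory
```

A hostname that does not start with a lowercase letter yields an empty string, so scripts that depend on CLUSTER may want to check for that case.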