OpenMPI

From HCL
Revision as of 10:03, 12 January 2011 by Zhongziming (talk | contribs)

http://www.open-mpi.org/faq/

MCA parameter files

To make MCA parameter settings permanent, put them in the file $HOME/.openmpi/mca-params.conf, one "name = value" setting per line, e.g.:

cat $HOME/.openmpi/mca-params.conf
btl_tcp_if_exclude = lo,eth1

Running applications on Multiprocessors/Multicores

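Processes can be bound to specific sockets and cores on a node by passing the appropriate options to mpirun, e.g. --bind-to-core or --bind-to-socket in the Open MPI 1.4 series (newer releases spell these --bind-to core and --bind-to socket). Adding --report-bindings makes mpirun print the bindings it applied.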

Debugging applications on Multiprocessors/Multicores

    • 1. Attach to individual MPI processes after they are running.
     For example, launch your MPI application as normal with mpirun. Then log in to the node(s) where your application is running and use the --pid option to gdb to attach to your application.
     An inelegant-but-functional technique commonly used with this method is to insert the following code in your application where you want to attach:
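     {
         /* Needs <stdio.h> and <unistd.h> at the top of the file. */
         volatile int i = 0;  /* volatile: the loop must re-read i after the debugger changes it */
         char hostname[256];
         gethostname(hostname, sizeof(hostname));
         printf("PID %d on %s ready for attach\n", (int)getpid(), hostname);
         fflush(stdout);
         while (0 == i)
             sleep(5);        /* sleep rather than busy-wait */
     }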
     This code prints a line to stdout with the name of the host where the process is running and the PID to attach to. It then loops on sleep() indefinitely, waiting for you to attach with a debugger. Because the loop body is sleep(), the processor is not pegged at 100% while waiting.
     Once you attach with a debugger, go up the function stack until you are in this block of code (you'll likely attach during the sleep()), then set the variable i to a nonzero value. With GDB, the syntax is:
       (gdb) set var i = 7
     Then set a breakpoint after your block of code and continue execution until the breakpoint is hit. Now you have control of your live MPI application and can use the full functionality of the debugger.
     You can even add conditionals to only allow this "pause" in the application for specific MPI processes (e.g., MPI_COMM_WORLD rank 0, or whatever process is misbehaving). 
    • 2. Use mpirun to launch xterms (or equivalent) with serial debuggers.
     shell$ mpirun -np 4 xterm -e gdb my_mpi_application
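The conditional "pause" mentioned under technique 1 could be sketched roughly as follows. This is only an illustration, not the FAQ's own code: the helper names should_wait_for_attach and wait_for_attach are hypothetical, and the sketch assumes the process was started by Open MPI's mpirun, which exports the rank in the OMPI_COMM_WORLD_RANK environment variable. Under other launchers, or after MPI_Init, calling MPI_Comm_rank is the portable way to get the rank.

```c
#define _POSIX_C_SOURCE 200112L  /* for gethostname() under strict C modes */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Returns 1 if this process has the given MPI_COMM_WORLD rank, else 0.
 * Assumption: the launcher exports OMPI_COMM_WORLD_RANK (Open MPI's mpirun
 * does). If the variable is missing or not a plain number, returns 0. */
int should_wait_for_attach(int target_rank)
{
    const char *s = getenv("OMPI_COMM_WORLD_RANK");
    char *end;
    long rank;

    if (s == NULL || *s == '\0')
        return 0;
    rank = strtol(s, &end, 10);
    return (*end == '\0' && rank == target_rank) ? 1 : 0;
}

/* Hypothetical helper: pauses only the process with rank target_rank
 * until a debugger attaches and sets i to a nonzero value. */
void wait_for_attach(int target_rank)
{
    volatile int i = 0;  /* volatile so the loop re-reads i */
    char hostname[256];

    if (!should_wait_for_attach(target_rank))
        return;          /* every other rank continues immediately */

    gethostname(hostname, sizeof(hostname));
    hostname[sizeof(hostname) - 1] = '\0';
    printf("PID %d on %s ready for attach\n", (int)getpid(), hostname);
    fflush(stdout);
    while (0 == i)
        sleep(5);
}
```

Calling wait_for_attach(0) early in main() would then hold only rank 0, while all other ranks run on until their first blocking communication with it. Once attached, go up to the wait_for_attach frame and use "set var i = 7" as described above.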