Running LAM on CS Lab machines

You need to set up ssh so that you can log in to lab machines without entering your password.
    Here's what to do:

     On some lab machine:

     % ssh-keygen -t dsa
       (accept the defaults, and just hit RETURN when asked for a passphrase)

     % cd ~/.ssh
     % cat id_dsa.pub  >> authorized_keys2    

     Now try it out (you should not be asked for a password):

     % ssh somemachinename
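
To confirm that password-less login works on every machine you plan to use, you can loop over a list of hostnames with ssh in batch mode, which fails instead of prompting for a password. The following is a minimal sketch, not part of LAM or ssh: the function name check_hosts is made up for illustration, and it assumes a file with one hostname per line (such as the lamhosts file described later on this page).

```shell
# check_hosts FILE: try a password-less login to every host listed in FILE.
# FILE has one hostname per line; blank lines and '#' comments are skipped,
# and anything after the hostname on a line is ignored.
# Prints "ok HOST" or "FAILED HOST"; returns non-zero if any host failed.
check_hosts() {
    failed=0
    while read -r host _rest || [ -n "$host" ]; do
        case "$host" in
            ''|'#'*) continue ;;
        esac
        # BatchMode=yes makes ssh fail rather than ask for a password;
        # -n keeps ssh from reading the rest of the host list on stdin.
        if ssh -n -o BatchMode=yes "$host" true; then
            echo "ok $host"
        else
            echo "FAILED $host"
            failed=1
        fi
    done < "$1"
    return "$failed"
}
```

For example, `check_hosts ~/lam/lamhosts` prints one line per machine. Any host reported as FAILED still needs its key set up (or is unreachable).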
Next, set your LAMRSH environment variable:
    # if you use bash:  add to your .bashrc or .bash_profile file:
    export LAMRSH=ssh

    # if you use another shell (you likely don't) add to your .cshrc  file:
    setenv LAMRSH ssh
The main steps that you will need to take to run MPI programs are the following:
  1. Boot lam
    To boot lam, you need a lamhosts file (a list of the machines to use) in your lam subdirectory. First run recon to check that lam can be started on every node listed in that file:
     % recon -v lamhosts   # see the example below for this file's format 
         
    Then, to boot lam on the hosts in your host file (this has to be done once at the beginning of each session in which you run MPI programs):
     % lamboot -v lamhosts 
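  2. Run MPI applications
    Use mpirun to start your application on the booted nodes (the mandelbrot example further down shows a full run).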
  3. Monitor tasks with mpitask
    While your MPI application runs, you can look at the state of its tasks by running mpitask. To rerun mpitask automatically every X seconds, use watch:
     # every 1 second, run mpitask
     % watch -n 1 mpitask

  4. Shutdown lam
    When you are done running your MPI applications, shut down lam by running lamhalt. If lamhalt doesn't completely clean things up, try lamwipe:

     % lamhalt
     % lamwipe -v lamhosts   # only if lamhalt didn't clean everything up
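
The lamhosts file passed to recon, lamboot, and lamwipe is a plain text file that lists one machine per line; lines starting with # are comments. As a minimal sketch, the helper below writes such a file from hostnames given as arguments. The name write_lamhosts and the machine names in the usage line are hypothetical; substitute real lab machine names.

```shell
# write_lamhosts FILE HOST...: write a LAM host file with one host per line.
write_lamhosts() {
    file=$1
    shift
    {
        echo "# LAM host file: one machine per line"
        for host in "$@"; do
            echo "$host"
        done
    } > "$file"
}
```

For example, `write_lamhosts ~/lam/lamhosts machine1 machine2 machine3` creates a three-node host file that you can then check with `recon -v lamhosts`.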

An Example MPI application to try

Before you start coding, try copying over and running some of the sample MPI programs in /usr/share/doc/lam4-dev/examples/main/. For example, here is what you need to do to run the mandelbrot code (following instructions from the lam tutorial):
  %  mkdir lam
  %  cd lam
  %  vi lamhosts	# add some hosts 
     
  %  recon -v lamhosts 
  %  lamboot -v lamhosts 
  %  tping -c1 N
  %  cp -r /usr/share/doc/lam3-dev/examples/main/mandelbrot .   # or lam4-dev, whichever is installed
  %  cd mandelbrot/
  %  gunzip *.gz
  %  mpicc -o master master.c 
  %  mpicc -o slave slave.c 
  %  vi appfile			# create appfile schema file: it describes
				# the application: each node and program
				# the node names come from 'recon -v lamhosts'
  %  cat appfile
     # 1 master, 5 slaves
     n0 master 
     n1-5 slave 

  %  mpirun -v appfile 		# run the MPI application
  %  xv mandel.out 		

  %  lamclean -v		# clean up lam stuff from this run
  %  mpirun -v appfile 		# re-run the mandelbrot app 
  %  lamhalt -v			# halt lam...you are done running lam apps
  %  lamwipe -v ../lamhosts	# only if lamhalt didn't work
Try running this application with different numbers of slaves by modifying the appfile. While it runs, log in to a node that is running slave programs and run top to watch the slave program(s) get spawned and start running.
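
To make it quicker to try different slave counts, the appfile can be generated instead of edited by hand. The sketch below is only an illustration (make_appfile is a made-up helper name); it follows the layout of the appfile shown above, with the master on n0 and the slaves on n1 through nN, so your lamhosts file needs at least N+1 machines.

```shell
# make_appfile NSLAVES: print an appfile with one master on node n0 and
# one slave on each of the nodes n1..nNSLAVES.
make_appfile() {
    nslaves=$1
    if [ "$nslaves" -lt 1 ]; then
        echo "make_appfile: need at least 1 slave" >&2
        return 1
    fi
    echo "# 1 master, $nslaves slaves"
    echo "n0 master"
    if [ "$nslaves" -eq 1 ]; then
        echo "n1 slave"
    else
        echo "n1-$nslaves slave"
    fi
}
```

For example, `make_appfile 3 > appfile` followed by `mpirun -v appfile` reruns the application with three slaves.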

MPI Links

MPI Tutorial: the links under "Another MPI Tutorial" and "Getting Started with LAM/MPI" are pretty helpful.
MPI 1 Standard. Other MPI documentation is available here.
List of MPI functions from the MPI Standard