Using XSEDE

Setting up XSEDE and Comet Accounts
Using Comet
An Example to try on Comet
Help and Resources
XSEDE and Comet Account Set-up

Getting an XSEDE account

If you have not already done so, set up your XSEDE account by following these steps:
  1. Go to https://portal.xsede.org/
  2. Click the "Create Account" button to request a new account:
    Organization: Swarthmore
    Department: Computer Science
    Registration Key: the first 6 characters of your Swarthmore user name
    
  3. When you are given a choice of user name, pick your Swarthmore user name (e.g., mine is tnewhal1). If your Swat user name is already taken, pick a different one and let me know what you picked.
It will take a day or so for your account to be activated.

Setting up logins to the XSEDE portal and the Comet system

  1. Go to https://portal.xsede.org/ and log in with your user name (use "request a password" to set your password).
  2. Select My XSEDE->Profile
  3. Choose "Enroll in Duo" in the upper right

    Add a device. If you have a phone you carry in your pocket, use that. If not contact Andrew Ruether in ITS and he will give you a device to use and help you register it.

Log into Comet

The first time you log into Comet, you need to do so through XSEDE:
ssh newhall@login.xsede.org
# enter passcode
gsissh comet
Once logged into Comet, add the public key from your CS account (it is in ~/.ssh/id_rsa.pub) to your authorized keys on Comet. On Comet:
mkdir -p ~/.ssh && chmod 700 ~/.ssh
vi ~/.ssh/authorized_keys    # paste in your public key from your CS account
chmod 600 ~/.ssh/authorized_keys
Our git guide has information about generating keys on our system: generating keys
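If you prefer to install the key without opening an editor, the same setup can be done with a couple of commands. This is only a sketch: the echoed key string below is a placeholder; paste the actual contents of your CS account's ~/.ssh/id_rsa.pub in its place.

```shell
# On Comet: create ~/.ssh with the permissions sshd requires, then append
# your public key to authorized_keys. The key string here is a placeholder.
mkdir -p ~/.ssh
chmod 700 ~/.ssh
echo "ssh-rsa AAAA...placeholder... yourusername@cs" >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
```

The chmod steps matter: sshd silently ignores an authorized_keys file (or a ~/.ssh directory) that is group- or world-writable.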

Logging directly into Comet

After adding your public key(s), you can ssh or scp directly to Comet from a CS machine:
ssh newhall@comet.sdsc.edu
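Optionally, you can add a host alias to ~/.ssh/config on your CS account so that a plain "ssh comet" works without typing the full host name. The user name below is an example; substitute your own Comet user name:

```
# ~/.ssh/config on your CS account
Host comet
    HostName comet.sdsc.edu
    User newhall
```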
Using Comet and submitting jobs

Copying files

You can use scp to copy files between your CS account and Comet. For example, from Comet I can copy over a single file or a whole subdirectory (swap the source and destination to copy from Comet to CS):
# copy over foo.c
scp newhall@cs.swarthmore.edu:/home/newhall/public/foo.c .

# WARNING: this does a recursive copy of all contents under the specified
# directory (my mpi_examples directory in this example):
scp -r newhall@cs.swarthmore.edu:/home/newhall/public/mpi_examples .
You can also create a single tar file of a set of files to copy over easily, and then untar it on the other side. Here is my documentation about using tar; see the tar man page for more information.
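As a sketch of that workflow (the file and directory names here are just examples):

```shell
# Bundle a directory into a single tar file, copy it over, then unpack it.
mkdir -p mpi_examples
echo 'int main(void) { return 0; }' > mpi_examples/foo.c
tar cvf mpi_examples.tar mpi_examples      # create the archive

# scp mpi_examples.tar newhall@comet.sdsc.edu:~/   # copy it over (from CS)

mkdir -p unpacked
tar xvf mpi_examples.tar -C unpacked       # unpack (e.g., on Comet)
ls unpacked/mpi_examples                   # the copied files
```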

Running Jobs

Comet runs the SLURM resource manager for submitting and managing jobs. Some useful commands:
squeue            # list all jobs in the job queue (there will be lots)
sbatch  job.mpi   # to submit a batch job from a slurm script
                  # this will give you a jobId  for your job
squeue -u yourusername # list all jobs in the job queue that you own
scancel jobId     # to kill a queued job 

man slurm         # the man page for slurm

Queues

For MPI, use the debug queue while debugging and the compute queue for larger, longer test runs. If you use Comet for your course project, it also has GPU queues.

SLURM job script

The SLURM job script specifies information about the job you are submitting, including the number of nodes, the number of MPI processes, and an estimate of your job's runtime.

/share/apps/examples/  # example job scripts  (see mpi examples)
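For example, a job script for an MPI run might look something like the following. This is only a sketch: the job name, node and task counts, and the ibrun launcher are assumptions on my part; start from the scripts in /share/apps/examples/ for the exact form Comet expects.

```
#!/bin/bash
#SBATCH --job-name="hellompi"
#SBATCH --output="hellompi.%j.out"   # %j expands to the job ID
#SBATCH --partition=debug            # debug queue; use compute for long runs
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=24         # assumption: 24-core Comet nodes
#SBATCH -t 00:05:00                  # runtime estimate (5 minutes)

ibrun ./helloworld                   # ibrun is SDSC's MPI launcher; the
                                     # course example uses mpirun_rsh instead
```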

Lustre file system

If you use Comet for data-intensive computing that requires large input or output file storage, use the Lustre file system on Comet to store those files.

The Comet User's Guide has a lot more information and examples.

Hello World Example to try out

I have a very simple example MPI program and SLURM run script for submitting to the debug queue on Comet. You can try it out as follows:
# from Comet, copy over my hello world example, untar it, and make
scp newhall@cs.swarthmore.edu:/home/newhall/public/XSEDE_MPI.tar .
tar xvf XSEDE_MPI.tar
cd XSEDE_MPI
make

vi hello.mpi  # change the path to the helloworld executable to your path
              # (in the mpirun_rsh command line, change newhall to your user name)

# submit your job 
sbatch hello.mpi

# check its status
squeue -u yourusername

# after it has run its output is in a file (vi, cat, less, ... to view)
less helloworldJOBID.out
hello.mpi is an example SLURM run script that you can use as a starting point for other MPI applications you run. It submits the job to the debug queue, which is the one to use for testing before submitting longer experiment runs to the compute queue.

Useful Functions and Resources