Bulldogh
From Yale HPC Wiki
Logging on
Access to bulldogh.wss.yale.edu is granted to all account owners and their users. The machines are only accessible from the Yale network. To reach the cluster from outside the Yale network, first log onto another machine within Yale and then log onto bulldogh from there. Users without their own desktops can use the Pantheon login machines. For security reasons, only Yale netids are allowed access. Temporary netids or netids for outside collaborators can be requested with the help of your Administrative Assistant.
If you are off campus, you can also connect to bulldogh using Yale's VPN service. More information is available at http://www.yale.edu/its/network/vpn.html.
About the system
Each compute node has four Intel "Woodcrest" CPU cores, 8GB of RAM (limited to 7.5GB for computation), and 40GB of local disk in /scratch. Each group has a disk quota of approximately 100GB, shared by the whole group. If you exceed this limit you can email hpc@yale.edu and request that it be increased. Logging into bulldogh.wss.yale.edu puts you on the login node. This node is for compiling, debugging, and submitting jobs. Please do not run long jobs on the login node. The entire system is monitored, and its status is viewable at http://hpc-status.wss.yale.edu
Compilers available
Currently the cluster supports the intel and gcc compilers. If you require additional compilers please email hpc@yale.edu. Both of these compilers are in the default path for all users. By default applications typically compile using the gcc compilers. If you wish to use the intel compilers, the binaries are listed below.
* Intel compilers version 9.1
o CC=icc (c compiler)
o CXX=icpc (c++ compiler)
o F77=ifort (fortran 77 compiler)
o FC=ifort (fortran 90 compiler)
o The intel debugger is available at: /usr/local/cluster/intel/compiler90/idb/9.0
The intel debugger manual is available from intel's website, which is probably the best place to get started.
If you are using a "configure" build script, you might need to add these lines to your .bashrc file to build with the intel compilers.
export CC=icc
export CXX=icpc
export F77=ifort
export FC=ifort
Generally this is all you need to do since the compiler libraries are dynamic and configured by us. If you need to specify static libraries, flags, or includes you may need to add these variables to your environment.
CFLAGS    C compiler flags
LDFLAGS   linker flags, e.g. -L<lib dir> if you have libraries in a nonstandard directory <lib dir>
CPPFLAGS  C/C++ preprocessor flags, e.g. -I<include dir> if you have headers in a nonstandard directory <include dir>
CXXFLAGS  C++ compiler flags
FFLAGS    Fortran 77 compiler flags
The intel libraries and headers are located under /usr/local/cluster/compilers/intel/.
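As a rough sketch of how these variables fit together, the snippet below sets the compiler names from the list above and adds search paths for a library in a nonstandard location. The directory /path/to/mylib is a placeholder, not a real location on the cluster.

```shell
# Select the intel compilers for a configure-based build.
export CC=icc
export CXX=icpc
export F77=ifort
export FC=ifort

# Hypothetical install prefix of an extra library; replace with your own.
MYLIB=/path/to/mylib

# Append to any flags already set rather than overwriting them.
export CPPFLAGS="${CPPFLAGS:-} -I$MYLIB/include"
export LDFLAGS="${LDFLAGS:-} -L$MYLIB/lib"

# configure reads these variables from the environment, e.g.:
#   ./configure && make
echo "CC=$CC CXX=$CXX F77=$F77 FC=$FC"
echo "CPPFLAGS=$CPPFLAGS"
echo "LDFLAGS=$LDFLAGS"
```

Placing these lines in your .bashrc makes them apply to every build; running them in the current shell affects only that session.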
Libraries Available
At this time, only the intel math libraries are available on the cluster. If you wish to have additional libraries installed please email hpc@yale.edu.
Intel Math Kernel Library
The math kernel libraries are installed in /usr/local/cluster/lib/intel/
Documentation for the libraries can be found here: http://www.intel.com/software/products/mkl/docs/mklgs_lnx.htm
Queues
Each user has at least two queues available to them, a group queue and the general queue. Your group queue has a guaranteed number of processors for you to run on based upon your purchase. If you submit a job in your group queue it will run when the resources are available with no interruptions. If you choose to use the general queue, you have access to the entire unused portion of the cluster, but your jobs may be killed to make room for group priority jobs. Therefore breaking your jobs into small short runs is optimal to reduce the number of terminated jobs in the general queue.
In some cases you might want to move jobs between queues. For this use the qmove command. For example:
qmove general 5558
This moves job 5558 to the general queue.
Submitting and working with Jobs
writing qsub scripts
Jobs can be submitted from the login node with the qsub command. Each job is simply a shell script that will be executed when resources are available. This is a minimal PBS script:
#!/bin/bash
#PBS -q queue_name
#PBS -l nodes=1:ppn=1
command to execute
This script will submit the job to queue queue_name and request the use of one node/one processor.
See our Writing_a_PBS_Script page for further info.
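If you submit many similar jobs, the minimal script can be generated rather than written by hand. The sketch below prints such a script to standard output. The helper name make_pbs_script and the argument values are hypothetical; the result would be saved to a file and passed to qsub.

```shell
# Print a minimal PBS script.
# Arguments: queue name, node count, processors per node, command to run.
make_pbs_script() {
    cat <<EOF
#!/bin/bash
#PBS -q $1
#PBS -l nodes=$2:ppn=$3
$4
EOF
}

# Example: one node, one processor, on the general queue.
make_pbs_script general 1 1 "./myprogram"
```

Redirecting the output, for example with "> myjob.sh", gives a file in the same shape as the minimal script above.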
your group queue_name
Your queue_name should have been given to you when your account was created. If not, you may be able to recognize it in the list of queues. To print the queue names run the command:
qmgr -c "print server"|grep "create queue"|cut -d " " -f3
See our Job Scheduler for more information about queues.
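To see what the qmgr pipeline above does, the sketch below runs the same grep and cut stages on a few sample lines. The sample text is made up for illustration and only mimics the shape of qmgr output (lines beginning with "create queue" followed by the name); real output on the cluster will differ.

```shell
# Illustrative stand-in for `qmgr -c "print server"` output (not real data).
sample_output='create queue general
set queue general queue_type = Execution
create queue mygroup
set queue mygroup queue_type = Execution'

# Keep only the "create queue" lines, then print the third
# space-separated field, which is the queue name.
queues=$(printf '%s\n' "$sample_output" | grep "create queue" | cut -d " " -f3)
printf '%s\n' "$queues"
```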
submitting jobs with qsub
The job is then submitted as follows:
qsub <shell_script>
qsub will return the number and name of the job. You can use the job number to then check on the status of the job using the command:
qstat <job number>
If you wish to kill or cancel this job use the qdel command. For example,
qdel <job number>
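Since qstat and qdel both take the job number, it can be handy to capture it when submitting. The following is a minimal sketch, assuming qsub prints an identifier of the form number.servername; the identifier shown and the helper name job_number are made up for illustration.

```shell
# Strip everything from the first dot onward, leaving the job number.
job_number() {
    printf '%s\n' "${1%%.*}"
}

# On the cluster this would be: job_id=$(qsub myscript.sh)
job_id="5558.bulldogh"          # hypothetical qsub output
num=$(job_number "$job_id")

echo "qstat $num"               # command to check the job
echo "qdel $num"                # command to cancel the job
```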
There are many different PBS directives that can be used. Another more involved example would be as follows:
#!/bin/sh
### Set the job name
#PBS -N myprogram
### Declare myprogram non-rerunable
#PBS -r n
### Combine standard error and standard out to one file.
#PBS -j oe
### Have PBS mail you results
#PBS -m ae
#PBS -M my.email@yale.edu
### Set the queue name, given to you when you get a reservation.
#PBS -q workq
### Specify the number of cpus for your job. This example will run on 32 cpus
### using 8 nodes with 4 processes per node.
#PBS -l nodes=8:ppn=4
# Switch to the working directory; by default PBS launches processes from your home directory.
# Jobs should only be run from /home, /project, or /work; PBS returns results via NFS.
echo Working directory is $PBS_O_WORKDIR
cd $PBS_O_WORKDIR
echo Running on host `hostname`
echo Time is `date`
echo Directory is `pwd`
echo This job runs on the following processors:
echo `cat $PBS_NODEFILE`
# Define number of processors
NPROCS=`wc -l < $PBS_NODEFILE`
echo This job has allocated $NPROCS cpus
# Run your job
./myprogram
# Alternatively, run a parallel MPI executable (use this line in place of the serial program above).
mpirun -v -machinefile $PBS_NODEFILE -np $NPROCS mympiprogram
See the Writing_a_PBS_Script page for more info.
Bulldog H also supports interactive jobs. Interactive jobs allow you to "login" via pbs to a compute node. This is particularly useful for reading log files or debugging an application. To log in interactively, add the -I flag to qsub and omit the script, for example:
qsub -I -l nodes=4:ppn=4 -q general
This will give you four nodes on the general queue.
See our Job Scheduler for more information.
Running MPI Jobs
Bulldog H has two mpi implementations installed on the cluster, openmpi and mpich2. mpich2 is considered the standard mpi implementation because of its speed and stability. In recent months, openmpi has grown in popularity considerably because of its ease of use, documentation, and support. If you're new to mpi I recommend reading the openmpi documentation even if you're interested in using mpich2.
To switch your mpi implementation, edit your .bashrc file. It contains four lines, one for each combination of compiler (gcc or intel) and mpi implementation; leave exactly one of them uncommented. You can verify which mpi version is active by running:
which mpicc
The mpi paths are listed below:
export PATH=/usr/local/cluster/mpi/mpich-gcc/bin:$PATH
#export PATH=/usr/local/cluster/mpi/mpich-intel/bin:$PATH
#export PATH=/usr/local/cluster/mpi/openmpi-gcc/bin:$PATH
#export PATH=/usr/local/cluster/mpi/openmpi-intel/bin:$PATH
Of course after changing your .bashrc you will need to relogin or "source" the dot file by running:
source ~/.bashrc
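If you switch implementations often, a small shell function can save editing .bashrc each time. This is only a sketch: use_mpi is a hypothetical helper, not something installed on the cluster, and it assumes the four directory names listed above.

```shell
# Prepend the bin directory of the chosen mpi build to PATH.
# Valid names follow the paths above: mpich-gcc, mpich-intel,
# openmpi-gcc, openmpi-intel.
use_mpi() {
    case "$1" in
        mpich-gcc|mpich-intel|openmpi-gcc|openmpi-intel)
            export PATH="/usr/local/cluster/mpi/$1/bin:$PATH"
            ;;
        *)
            echo "unknown mpi build: $1" >&2
            return 1
            ;;
    esac
}

use_mpi openmpi-gcc
echo "${PATH%%:*}"    # the first PATH entry is now the chosen bin directory
```

Because the shell searches PATH from left to right, the most recently selected build wins. Running "which mpicc" afterwards on the cluster confirms the choice.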
MPICH2
We chose to standardize on mpich2 compiled with gcc as the default message passing system on bulldogh. MPICH is generally considered the fastest, most stable, and most programmer friendly of the mpi implementations. You may choose to use mpich with the gcc or intel compilers. To switch between mpi versions, edit your .bashrc file by commenting and uncommenting the appropriate lines. Using the mpi compiler wrappers will generally link your application appropriately. If you need to link directly, the lib and include directories are in /usr/local/cluster/mpi/mpich*
The entire mpich2 readme is in /usr/local/cluster/doco/mpich2-doc-README.txt. mpich2 uses mpd daemons to pass mpi messages. In each user's home directory a .mpd.conf file was created and populated with a random secretword string. Therefore you can run an mpi job using the simple script below:
#!/bin/bash
#PBS -q queue_name
#PBS -l nodes=2:ppn=4
MPD_CON_EXT=`date`
NPROCS=`wc -l < $PBS_NODEFILE`
NHOSTS=`cat $PBS_NODEFILE|uniq|wc -l`
mpdboot --file=$PBS_NODEFILE -n $NHOSTS
mpiexec -n $NPROCS my_job
mpdallexit
This script uses the PBS_NODEFILE to calculate the number of processes to run based upon the -l pbs flag.
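The two counts, NPROCS and NHOSTS, can be checked by hand on a small node file. The sketch below builds a fake node file in the shape PBS would produce for nodes=2:ppn=4, with one line per processor. The host names c001 and c002 are invented, and PBS_NODEFILE is set by hand here only so the example runs outside a job.

```shell
# Build a stand-in for $PBS_NODEFILE: one line per allocated processor.
PBS_NODEFILE=$(mktemp)
for host in c001 c002; do
    for slot in 1 2 3 4; do
        echo "$host"
    done
done > "$PBS_NODEFILE"

# Number of processes = number of lines in the node file.
NPROCS=$(wc -l < "$PBS_NODEFILE")
# Number of hosts = number of distinct host names.
NHOSTS=$(sort -u "$PBS_NODEFILE" | wc -l)

# Arithmetic expansion strips the padding some wc versions add.
NPROCS=$((NPROCS))
NHOSTS=$((NHOSTS))

echo "NPROCS=$NPROCS NHOSTS=$NHOSTS"
rm -f "$PBS_NODEFILE"
```

This sketch uses sort -u rather than uniq so the host count stays correct even if a host's lines are not adjacent in the file. With adjacent lines the two give the same result.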
NOTE:
Please be sure that the variable MPD_CON_EXT=`date` is set in your pbs script before the mpdboot line. This changes the default MPICH2 behavior of associating all jobs by the same user to one mpd process per node. Adding this variable forces MPICH2 to launch a separate mpd per job. Without this variable a user with more than one job per node will have all their jobs fail once any job exits with the mpdallexit command.
openMPI
One of the advantages of openMPI over mpich2 is that it's pbs aware. The simplest pbs run script looks like:
#!/bin/bash
#PBS -q queue_name
#PBS -l nodes=2:ppn=4
mpirun my_job
In this case, the openmpi mpirun already knows which hosts and how many processes to run on.
For more information see http://www.open-mpi.org/faq/?category=running. openMPI has an array of options and performance tweaks. There are a ton of job specific variables that make mpi difficult. If you're having problems the openMPI email list is VERY active.
Troubleshooting
If you are running jobs and your memory usage exceeds 7.5GB, the system may kill your job. If this happens, try requesting an entire node for each job with the "-l nodes=1:ppn=4" flag in pbs. This flag limits each node, with its 7.5GB of ram, to one job. If you are still having memory problems, you may need to use mpi programming to spread the execution over multiple hosts.
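As a back-of-the-envelope check, you can estimate how many copies of a job fit in a node's usable memory. The sketch below assumes 7.5GB is 7680MB and that you already know your job's peak memory use in MB; jobs_per_node is a hypothetical helper.

```shell
NODE_MEM_MB=7680     # 7.5GB usable per node, expressed in MB
CORES_PER_NODE=4

# Print how many jobs of the given size (in MB) fit on one node,
# capped at the number of cores.
jobs_per_node() {
    fit=$((NODE_MEM_MB / $1))
    if [ "$fit" -gt "$CORES_PER_NODE" ]; then
        fit=$CORES_PER_NODE
    fi
    echo "$fit"
}

jobs_per_node 1500   # small jobs: all four cores can be used
jobs_per_node 3000   # only two such jobs fit
jobs_per_node 5000   # one job per node, so request nodes=1:ppn=4
```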
C++ and SEEK_SET
Some users may get error messages such as
SEEK_SET is #defined but must not be for the C++ binding of MPI
The problem is that both stdio.h and the MPI C++ interface use SEEK_SET, SEEK_CUR, and SEEK_END. This is really a bug in the MPI-2 standard. You can try adding
#undef SEEK_SET
#undef SEEK_END
#undef SEEK_CUR
before mpi.h is included, or add the definition
-DMPICH_IGNORE_CXX_SEEK
to the command line (this will cause the MPI versions of SEEK_SET etc. to be skipped). http://www-unix.mcs.anl.gov/mpi/mpich/faq.htm#cxxseek
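For configure-based builds, one way to pass the definition is through CPPFLAGS. This is a minimal sketch, assuming your build honors that variable; the file name in the commented mpicxx line is made up.

```shell
# Add the define to any preprocessor flags already set.
export CPPFLAGS="${CPPFLAGS:-} -DMPICH_IGNORE_CXX_SEEK"
echo "CPPFLAGS=$CPPFLAGS"

# When compiling by hand, the flag goes directly on the command line, e.g.:
#   mpicxx -DMPICH_IGNORE_CXX_SEEK -o my_job my_job.cpp
```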
Working with Paths
Each user's environment, including the path, is set up in the .bashrc file in their home directory. You can change your environment by adding or removing entries at the end of that file.
To see your full path or any other environment variable, run the command "env", which prints your environment. Running "env|grep PATH" shows only the path.
MPI and SSH
MPI uses ssh to launch the mpd daemons for mpich and openmpi. User accounts are created with passwordless ssh keys to allow the daemons to launch. If you altered the authorized_keys file in your .ssh directory mpi will not work. To test that this works run these commands from the login node:
qsub -q general -I
ssh c070 hostname
exit
The second command should print "c070" without any prompt.
Problems, Questions, Comments
Any problems, questions or comments should be sent to hpc@yale.edu. This will enter the issue into our problem ticketing system.
Getting a queue setup
For information regarding purchasing computing time please email hpc@yale.edu.

