Batch example

Here is a simple batch script example:

#PBS -l nodes=128:intel
#PBS -l walltime=24:00:00
#PBS -A lqcd
#PBS -q ib
cd "$PBS_O_WORKDIR" || exit 1
# stage in files if needed
./my_prog $MY_ARGS
EXIT_VAL=$?    # save the program's exit status before staging out
# stage out files if needed
exit $EXIT_VAL

This is a minimal sample script. All of the #PBS directives used above are required. #PBS -l nodes=128:intel asks the batch system for 128 nodes with the intel tag (at JLab, "nodes" means cores; see below). Available tags and their meanings are documented in the next subsection. #PBS -l walltime=24:00:00 requests 24 hours of walltime. #PBS -A lqcd charges the job to the account named lqcd; you must change the -A option to your own project name. The PI of your project will know which account name to use. #PBS -q ib selects the queue named ib.

Please see the PBS documentation to find out how to submit a job. Within your batch script, you must use appropriate commands to launch your job and to stage data files in and out (see below).

Notes on specifying number of nodes or cores

PBS at JLab is configured to schedule cores, not physical nodes. That is, the number of "nodes" that PBS knows about is actually the number of cores. For example:

         #PBS -l nodes=128

This will give you 128 CPU cores, regardless of how many physical nodes it takes to fulfill the request. At JLab this might be sixteen 8-core nodes or eight 16-core nodes.
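
The arithmetic can be sketched in a few lines of shell. The function name nodes_needed is hypothetical and not part of PBS; it simply rounds up, assuming the request is packed onto identical nodes.

```shell
# Hypothetical helper (not a PBS command): physical nodes occupied by a
# core request, rounding up because a partly used node is still a node.
nodes_needed() {
    # $1 = requested cores, $2 = cores per physical node
    echo $(( ($1 + $2 - 1) / $2 ))
}

nodes_needed 128 8     # prints 16
nodes_needed 128 16    # prints 8
```

The scheduler decides the actual placement; this only shows the core-to-node relationship described above.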

Choosing a specific machine type in a queue with more than one type of machine

To request a specific machine type, use one of the following tags in your submission script:

#PBS -l nodes=128:TAGTYPE

Here TAGTYPE is cores8 for 10q nodes or cores16 for 12s nodes. There are also a number of tags for different GPU configurations. See the Node tags table section of this chapter for a full list of valid tags.

Environment variables available within your job

The batch system exports a few environment variables that are available within the job script. These include:

PBS_ENVIRONMENT This is either PBS_INTERACTIVE for interactive jobs (submitted with qsub -I) or PBS_BATCH for regular batch jobs. The value itself is rarely needed, but a script can test whether the variable exists to see if it is running under batch system control.
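
A minimal sketch of such an existence test follows. The function name under_pbs is hypothetical, and the check assumes that nothing outside the batch system sets PBS_ENVIRONMENT.

```shell
# Hypothetical helper: succeeds (exit status 0) when PBS_ENVIRONMENT is
# set and non-empty, i.e. when the script appears to run under PBS.
under_pbs() {
    [ -n "${PBS_ENVIRONMENT:-}" ]
}

if under_pbs; then
    echo "running under PBS ($PBS_ENVIRONMENT)"
else
    echo "running outside the batch system"
fi
```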

PBS_JOBID This variable is useful because it is unique to the job. The value is the same as the one reported by qstat. It has the form XXX.server, where XXX is a sequence number that increases over time (it may reset, but this is very rare) and server is the name of the PBS batch server.
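
Because the value has the form XXX.server, the two parts can be separated with standard shell parameter expansion, for example to build a per-job file name. This is only a sketch: the function names and the fallback job ID 12345.example-server are made up for illustration.

```shell
# Hypothetical helpers that split a job ID of the form XXX.server.
jobid_seq()    { echo "${1%%.*}"; }   # text before the first dot
jobid_server() { echo "${1#*.}"; }    # text after the first dot

# Fall back to a made-up ID so the sketch also runs outside a job.
jobid="${PBS_JOBID:-12345.example-server}"
echo "sequence: $(jobid_seq "$jobid")"
echo "server:   $(jobid_server "$jobid")"
```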

PBS_NODEFILE This variable points to a file that has all of the hostnames of the nodes that the job is using. When jobs use more than one CPU core, the hostnames are repeated for each CPU core. This variable is very useful. For example, one can determine how many CPU cores a job is using by doing wc -l < $PBS_NODEFILE. To get a list of all of the hosts within a job, one can do uniq < $PBS_NODEFILE.
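
The two one-liners above can be wrapped as small functions, sketched below with hypothetical names. The host list here uses sort -u rather than uniq, because uniq only collapses adjacent duplicate lines, while sort -u does not depend on the order of the file.

```shell
# Hypothetical helpers for a node file that holds one hostname per core.
count_cores() { wc -l < "$1" | tr -d ' '; }   # number of lines = cores
list_hosts()  { sort -u "$1"; }               # each hostname once

# Only report when running inside a job, where PBS_NODEFILE is set.
if [ -n "${PBS_NODEFILE:-}" ]; then
    echo "cores: $(count_cores "$PBS_NODEFILE")"
    list_hosts "$PBS_NODEFILE"
fi
```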

PBS_JOBNAME This is the name of the job. The user can set it with qsub -N JOBNAME or within a script by adding #PBS -N JOBNAME. If no job name is explicitly set with -N, the job name is the same as the filename of the script used for the job. If no script is used, the job is assigned the default name STDIN.

PBS_O_QUEUE This is the name of the queue to which the job was submitted.

PBS_O_WORKDIR This is the directory from which the job was submitted. This is a very useful variable: it is where the output from the job is delivered, and it is often used as a point of reference for all of the job's files. Note: It is best to submit jobs from /home/somewhere and NOT from /scratch or /work or /cache.
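
When a script is also run by hand for testing, PBS_O_WORKDIR is not set. One possible way to handle both cases is sketched below; the function name job_workdir is hypothetical, and falling back to the current directory is an assumption, not a site convention.

```shell
# Hypothetical helper: the submission directory under PBS, otherwise the
# current directory (so the same script can be tried outside a job).
job_workdir() { echo "${PBS_O_WORKDIR:-$PWD}"; }

cd "$(job_workdir)" || exit 1
```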