Text Command File

Keywords

All keyword fields are specified as strings. Where a field can contain a list, the list elements should normally be separated by spaces or new lines: several elements may appear on a single line, and the list may continue over several lines. Only three keywords are mandatory; all others are optional and default (where appropriate) to a reasonable value. Blank lines are ignored. Lines with '#' in column 1 are treated as comments and are also ignored.
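
As an illustration of the syntax described above, a minimal command file might look like the sketch below. It assumes the "KEYWORD: value" form used in the notes further down. The project name myproject is a placeholder (see Valid Projects for real values), and debug is assumed to be a valid track name because it is mentioned under TIME. The lines are indented here only for display: in a real file the '#' of a comment line must be in column 1.

         # list the working directory on a farm node
         PROJECT: myproject
         TRACK: debug
         COMMAND: ls

         OPTIONS: -l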



Keywords Default                                 Function
Mandatory
PROJECT
 
none
Project to which the time should be accounted. See Valid Projects for a list of allowed projects.
TRACK
 
none
See Batch Job Tracks for information on allowed tracks.
COMMAND
 
none
The command to be run. Can be a system command (like 'ls') or a user script. For user commands specify the full path name.
Optional
OS
centos77
The type of OS the jobs should execute on: centos77, or general (for any OS variant).
NODE_TAG
  none
The type of machine the jobs should execute on: farm14, farm16, farm18 or farm19.
COMMAND_COPY
 
none
If present, copy the file specified by the COMMAND keyword to the local disk. Useful if the command is an executable rather than a script.
JOBNAME
  none
Name for the jobs. Used only as a label. Can be any string, but must not include spaces. The name must be 50 characters or fewer and must start with a letter.
MAIL
 
none
List of e-mail addresses to which to send results. The first addressee will receive mails from each individual job running under LSF as well as a summary mail from the server; the remaining addressees will only receive the summary mail. If the keyword is missing (or contains no addresses), the submitting user will receive all mails.
TIME
   1440
Time limit (minutes) for each individual job. The default is 1440 minutes (24 hours). The maximum value allowed on tracks other than debug/test/theory is 72 hours.
OPTIONS
 
none
Command line options associated with the command to be run.
INPUT_FILES
 
none
A list of files to be processed. The full path names of the files should be given. The elements of the list should be separated with spaces; they may all be on one line or run over several lines. Each file will result in a job on the farm. The file will be copied to the local disk for processing. See this note for details on how the farm can automatically cache the input files for your jobs when "/mss" stub paths are specified as the input.
SINGLE_JOB
false 
Specify this keyword (no parameters) to force a single job to process all the input files. The default is to process each input file in a separate job.
   See note2 below
MULTI_JOBS
  1
If only one or no input files are given, run the job this many times. If the input file list contains two or more files, this keyword is ignored.
    See note2 below
OTHER_FILES
 
none
Any other files that (for efficiency) should be copied to the farm node. These files will all be copied to all nodes. This may include, for instance, an executable program that is run by a user script given in the 'COMMAND' keyword.
INPUT_DATA


The name of the input file on the farm node. Each file given in the 'INPUT_FILES' list will be copied to this name on the farm. The user program should take its input from this filename. It is local to the farm and no pathname should be given.

If this keyword is not given, each input file in the 'INPUT_FILES' list will be copied to the local disk on the farm under its own name.
    See note1 below

OUTPUT_DATA
 
none
The name of the output data file generated by the program (this is a local file on the farm, so no pathname is needed). Each OUTPUT_DATA key should be paired with an OUTPUT_TEMPLATE key.

If the filename contains a wildcard character ( * ) then all matching files will be copied. In that case only one OUTPUT_TEMPLATE should be given, and it must be of the form "/directory/path/@OUTPUT_DATA@".

   See note3 below

OUTPUT_TEMPLATE
 
none
The template filename for each output file. Each file with the name given in 'OUTPUT_DATA' will be copied to a file with this name with the file extension given by the input file. If output is going to tape this template should be for the OSM stub files. If the filename part of the template is "@OUTPUT_DATA@" then the created file will have the same name as the output file on the farm node. If the filename part of the template is "*" or "@INPUT_DATA@" then the created file will have the same name as the input file. See the examples for clarity.
    See note3 below
CPU
  1
The number of CPU cores a job needs.
DISK_SPACE
  4 GB
The amount of disk space that your job will require. Auger will ensure that the machine your job runs on has at least the specified amount of disk space available. The value must be an integer followed by a space and a unit (MB or GB), e.g. 15 GB.
MEMORY
512 MB
The amount of memory that your job will require. Auger will ensure that the machine your job runs on has at least the specified amount of memory available. The value must be an integer followed by a space and the unit MB, e.g. 512 MB.
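
The resource keywords can be combined as in the following sketch, which assumes the "KEYWORD: value" form used in the notes below. The values are illustrative only. The point is the format of each one: TIME is a plain number of minutes, while DISK_SPACE and MEMORY are integers followed by a space and a unit.

         CPU: 2
         TIME: 120
         DISK_SPACE: 15 GB
         MEMORY: 1024 MB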

Notes:
1. Use of the INPUT_DATA keyword:

In order to copy a data file called /work/experiment/raw/run001.data to the local disk on the farm specify:

         INPUT_FILES: /work/experiment/raw/run001.data

To copy the same data file to the local disk on the farm under a different name, specify:


         INPUT_FILES: /work/experiment/raw/run001.data

         INPUT_DATA: fort.11

The file will then be named fort.11 locally. This is useful if you want
to generate many jobs from a list of input files. One job will be
generated for each input file and run on a different farm machine, but
all jobs will expect to read data from a file called "fort.11" (or
whatever you care to call it).
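
For instance, the following sketch would generate two jobs, each reading its own input file under the local name fort.11. The second file name, run002.data, is a made-up example, and the two list elements are simply separated by a space on one line.

         INPUT_FILES: /work/experiment/raw/run001.data /work/experiment/raw/run002.data

         INPUT_DATA: fort.11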


2. Single vs. multiple jobs.
By default the system will generate one job for each data file you
specify with the INPUT_FILES keyword. Use the following to change this
behaviour:

  • SINGLE_JOB

Forces all the specified input files to be processed in one single
job. All the specified files will be copied to the farm machine
local disk before the job starts executing. Remember that there is a
limit of 4 GB disk space per job. It probably does not make sense to
specify the INPUT_DATA keyword in this case (see note 1 above).


  • MULTI_JOBS



If a single file is (or no files are) specified as INPUT_FILES,
run the job this many times, one job per machine. This may be useful
for testing, or for simulations that run from the same input file of
generated events (it is up to you to make sure that the random number
seeds are different).
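
As a sketch of the MULTI_JOBS case, the fragment below asks for the same command to be run ten times with no input files. It assumes MULTI_JOBS takes its count in the same "KEYWORD: value" form as the other keywords. The script path is a hypothetical placeholder.

         COMMAND: /home/user/run_simulation.sh
         MULTI_JOBS: 10

Each of the ten jobs runs the same command, so the script itself has to pick a different random number seed for each run.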

3. Output files


The keywords OUTPUT_DATA and OUTPUT_TEMPLATE can be used in pairs
to control the disposition of output files. Each OUTPUT_DATA keyword
given should have a matching OUTPUT_TEMPLATE. The OUTPUT_DATA is
the name of the output file locally on the farm machine disk. The
corresponding OUTPUT_TEMPLATE directs the copying of the output file
to a mass storage device (the silo or a work area). There are several
points to note:

  • Using /mss as a prefix to the OUTPUT_TEMPLATE will automatically cause the file to be copied to the tape silo using jput.
  • If the template contains the character * then that character will be replaced by the name of the input file being processed.
  • If the template contains the string @INPUT_DATA@, it behaves
    in the same way as the * character: the string is replaced by the
    input file name.
  • If the template contains the string @OUTPUT_DATA@, that
    string will be replaced by the local name of the output file, i.e. the
    name specified with the keyword OUTPUT_DATA.
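
To illustrate the pairing, the sketch below copies one named output file to a work area. The file name out.hist and the directory /work/experiment/hist are hypothetical. Because the filename part of the template is @OUTPUT_DATA@, the copied file keeps its local name and ends up as /work/experiment/hist/out.hist.

         OUTPUT_DATA: out.hist
         OUTPUT_TEMPLATE: /work/experiment/hist/@OUTPUT_DATA@

Replacing the filename part of the template with * (or @INPUT_DATA@) would instead name the copied file after the input file being processed.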