To submit a job to the farm you need to create a configuration script that describes your job(s). There are two different formats for the configuration file. The first is a flat file format (TXT) which has been supported for years. The second format is an XML based format, which allows some additional description.
All keyword fields are specified as strings. Where a field can contain a list, the list elements should normally be separated by spaces (or new lines), several elements may appear on a single line and the list may continue over several lines. There are only three mandatory keywords, all others are optional and default (where appropriate) to something reasonable. Blank lines are ignored, lines with '#' in column 1 are treated as comments (and ignored).
Keywords | Default | Function |
---|---|---|
Mandatory | ||
PROJECT | none | Project to which the time should be accounted. See Valid Projects for a list of allowed projects. |
TRACK | none | See Batch Job Tracks for information on allowed tracks. |
COMMAND | none | The command to be run. Can be a system command (like 'ls') or a user script. For user commands specify the FULL path name. |
Optional | ||
OS | centos77 | Which type of OS the jobs should execute on: centos77, or general (for any OS variant). |
NODE_TAG | none | Which type of machine the jobs should execute on: farm14, farm16, farm18, or farm19. |
COMMAND_COPY | none | If present - copy the file specified by the COMMAND keyword to the local disk. Useful if the command is the executable rather than a script. |
JOBNAME | none | Name for the jobs. Used only as a label. Can be any string without spaces; must be 50 characters or less and must start with a letter. |
MAIL | none | List of e-mail addresses to which to send results. The first addressee will receive mails from each individual job running under LSF as well as a summary mail from the server; the remaining addressees will only receive the summary mail. If the keyword is missing (or contains no addresses) the submitting user will receive all mails. |
TIME | 1440 | Time limit (minutes) for each individual job. The default is 1440 minutes (24 hours); the maximum value allowed for tracks other than debug/test/theory is 72 hours. |
OPTIONS | none | Command line options associated with the command to be run. |
INPUT_FILES | none | A list of files to be processed. The full path names of the files should be given. The elements of the list should be separated with spaces; they may all be on one line or run over several lines. Each file will result in a job on the farm, and the file will be copied to the local disk for processing. See the note on "/mss" stub paths for details on how the farm can automatically cache the input files for your jobs. |
SINGLE_JOB | false | Specify this keyword (no parameters) to force a single job to process all the input files. The default is to process each input file in a separate job. See note 2 below. |
MULTI_JOBS | 1 | If only one or no input files are given, run the job this many times. If the input file list contains two or more files this keyword is ignored. See note 2 below. |
OTHER_FILES | none | Any other files that (for efficiency) should be copied to the farm node. These files will all be copied to all nodes. This may include an executable program that is for instance run by a user script given in the 'COMMAND' keyword. |
INPUT_DATA | none | The name of the input file on the farm node. Each file given in the 'INPUT_FILES' list will be copied to this name on the farm. The user program should take its input from this filename. It is local to the farm node and no pathname should be given. If this keyword is not given, each input file in the 'INPUT_FILES' list will be copied to the local disk on the farm under its own name. |
OUTPUT_DATA | none | The name of the output data file generated by the program (this is a local file on the farm; no pathname is needed). This keyword should be paired with an OUTPUT_TEMPLATE keyword. If the filename contains a wildcard character (*) then all matching files will be copied; in that case only one OUTPUT_TEMPLATE should be given, and it must be of the form "/directory/path/@OUTPUT_DATA@". See note 3 below. |
OUTPUT_TEMPLATE | none | The template filename for each output file. Each file with the name given in 'OUTPUT_DATA' will be copied to a file with this name, with the file extension given by the input file. If output is going to tape this template should be for the OSM stub files. If the filename part of the template is "@OUTPUT_DATA@" then the created file will have the same name as the output file on the farm node. If the filename part of the template is "*" or "@INPUT_DATA@" then the created file will have the same name as the input file. See the examples for clarity. See note 3 below. |
CPU | 1 | The number of CPU cores a job needs. |
DISK_SPACE | 4 GB | The amount of disk space that your job will require. Auger will ensure that the machine your job runs on will have at least the specified amount of disk space available. The disk space value you specify must be an integer and must have a unit (MB or GB) after the number with a space in between. (e.g. 15 GB). |
MEMORY | 512 MB | The amount of memory that your job will require. Auger will ensure that the machine your job runs on will have at least the specified amount of memory available. The memory value you specify must be an integer and must have a unit (MB) after the number with a space in between. |
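Putting the keywords together, a minimal flat-file configuration might look like the following sketch. The project name, script path, and resource values are placeholders; the keyword/colon syntax follows the INPUT_FILES examples in the notes below.

```text
# Sketch of a flat-file job configuration (placeholder values)
PROJECT: myproject
TRACK: analysis
COMMAND: /home/user/bin/myscript.sh
JOBNAME: example_run
TIME: 600
DISK_SPACE: 10 GB
MEMORY: 1024 MB
```

Only PROJECT, TRACK, and COMMAND are mandatory; the remaining keywords here override the defaults from the table above.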
1. Use of the INPUT_DATA keyword.
In order to copy a data file called /work/experiment/raw/run001.data to the local disk on the farm specify:
INPUT_FILES: /work/experiment/raw/run001.data
To copy a data file called /work/experiment/raw/run001.data to the local disk on the farm as a different name called fort.11, specify:
INPUT_FILES: /work/experiment/raw/run001.data
INPUT_DATA: fort.11
This is useful if you want to generate many jobs from a list of input files. One job will be generated for each input file and run on a different farm machine, but all jobs will expect to read data from a file called fort.11 (or whatever you care to call it).
2. Single vs. multiple jobs.
By default the system will generate one job for each data file you specify with the INPUT_FILES keyword. Use the SINGLE_JOB and MULTI_JOBS keywords to change this behaviour.
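As a sketch (the input paths are hypothetical), SINGLE_JOB collapses several input files into one job, while MULTI_JOBS repeats a job that has at most one input file:

```text
# One job processes all three input files (SINGLE_JOB takes no parameters)
INPUT_FILES: /work/expt/raw/run001.data /work/expt/raw/run002.data /work/expt/raw/run003.data
SINGLE_JOB:

# Alternatively, with no input files, run the same command 10 times
MULTI_JOBS: 10
```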
3. Output files.
The keywords OUTPUT_DATA and OUTPUT_TEMPLATE can be used in pairs to control the disposition of output files. Each OUTPUT_DATA keyword given should have a matching OUTPUT_TEMPLATE. The OUTPUT_DATA is the name of the output file locally on the farm machine disk. The corresponding OUTPUT_TEMPLATE directs the copying of the output file to a mass storage device (the silo or a work area).
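For example (the directory path is hypothetical), a matched pair that copies a file written on the farm node's local disk back to a work area:

```text
# The program writes output.root on the farm node's local disk;
# @OUTPUT_DATA@ in the template expands to that filename, so the
# file is copied back as /work/expt/results/output.root
OUTPUT_DATA: output.root
OUTPUT_TEMPLATE: /work/expt/results/@OUTPUT_DATA@
```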
All jobs submitted to the farm will belong to a track. The track is used to determine the queue a job runs in and the type of job for accounting purposes.
Track | Batch Queue | Description |
---|---|---|
debug | priority | For debug use |
reconstruction | production | Reconstruction of raw data |
analysis | production | Analysis jobs |
one_pass | production | Combined reconstruction/analysis jobs |
simulation | production | Simulation jobs |
test | priority | Test run of code on the farm |
theory | --- | Running long theory jobs |
Note that there are two special tracks: debug and test. The debug track is useful when you are trying to debug your jobs to make sure that they run correctly. They get the highest priority when running on the farm, but are limited to 4 hours of CPU time. The test track is at the opposite end of the spectrum and is used for jobs that don't have to finish any time soon. Not only do jobs in the test track get the lowest priority for being dispatched to the farm, but even after being dispatched these jobs will be suspended if the farm gets busy.
This document describes the specification for the XML based job description language (JDL) that Auger uses.
The JDL is XML based. The top level tag is a Request tag. The submission consists of a description of one or more jobs that are to be run on the farm. The valid subtags and tag attributes are described below. The script can be augmented with the use of variables. Comments use standard XML comment syntax: a comment starts with <!-- and ends with -->.
There are two tags that are special for describing a job: <Request> and <Job>. The Request tag is the top level tag and is required. Each Job tag corresponds to an individual Farm job. There must be at least one Job inside the Request element. The remaining tags specify properties of the jobs. Note that all tags are case sensitive.
Request - Every submission script must consist of exactly one <Request> tag, which is the top level tag. The Request tag must not contain any attributes. All of the properties that describe a submission are specified in subtags. Note that many properties can be specified at either the Request level or the Job level. Properties that are set at the Request level apply to every Job. This allows you to set global properties once, rather than having to re-specify them for each individual job.
Job - A Job tag corresponds to an individual job for the Farm. This tag takes no attributes; it gets all of its values from properties specified in tags at the Request level or in tags inside the Job tag itself. There may be any number of Job tags within a request; each one will result in a job on the farm.
ForEach - The <ForEach> tag is used to create multiple jobs. It acts as a looping mechanism. It has one required attribute, list. The value of this attribute must be the name of a list that was defined using a <List> tag. The ForEach tag is equivalent to having the body of the ForEach repeated once for every element in the list, with the current element stored in a variable with the name of the list. There must be a <Job> tag inside the ForEach tag (otherwise the ForEach doesn't do anything). ForEach tags can be nested if desired; in this case, only the innermost ForEach needs to contain a <Job> tag.
Required Properties - There are certain properties that must be set for a job to be valid. These properties can either be set at the Job level, or at the Request level. Valid tags are <Command>, <Name>, <Project> and <Track>.
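A minimal request satisfying the four required properties might look like the following sketch; the project name, command path, and job name are placeholders, and the empty <Job> inherits everything from the Request level.

```xml
<Request>
  <Project name="myproject" />
  <Track name="analysis" />
  <Name name="example_run" />
  <Command>/home/user/bin/myscript.sh</Command>
  <!-- One job, taking all its properties from the Request level -->
  <Job>
  </Job>
</Request>
```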
Tag Type | Valid Tags | Example |
---|---|---|
Request | <Request> <Command> ...... </Command> <DiskSpace space="xxx" unit="MB/GB" /> <Memory space="xxx" unit="MB/GB" /> <OS name="xxx" /> <Email email="xxx@jlab.org" request="false/true" job="true/false" /> <Input src="xxx" dest="xxx" /> <Output src="xxx" dest="xxx" /> <Job> ....... </Job> <ForEach list="xxx"> ...... </ForEach> <List name="xxx"> ......</List> <Name name="xxx" /> <Project name="xxx" /> <Track name="xxx" /> <Stderr dest="xxx" /> <Stdout dest="xxx" /> <TimeLimit time="xxx" unit="minutes/hours" /> <Variable name="xxx" value="xxx" /> |
<Request> <!-- Job descriptions go here --> </Request> |
Job | <Command> <DiskSpace> <Input> <Name> <Output> <Track> <Stderr> <Stdout> <TimeLimit> <Variable> |
<Request> <!-- Properties Specified Here --> <Job> <!-- Properties overridden here --> </Job> </Request> |
ForEach | <Command> <DiskSpace> <ForEach> <Input><Job> <Name> <Output> <Stderr> <Stdout> <TimeLimit> <Variable> |
<Request> <!-- Properties Specified Here --> <List name="words">hello world abc 123 more words</List> <ForEach list="words"> <Job> <!-- Properties overridden here --> <Command>echo ${words}</Command> </Job> </ForEach> </Request> |
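Combining the pieces, here is a sketch of a request that loops over a list of run numbers, staging one input file and one output file per job. All paths, names, and the run-number list are placeholders; the ${runs} variable expands to the current list element, as in the ForEach example above.

```xml
<Request>
  <Project name="myproject" />
  <Track name="reconstruction" />
  <Name name="recon_pass1" />
  <Command>/home/user/bin/recon.sh</Command>
  <List name="runs">001 002 003</List>
  <ForEach list="runs">
    <!-- One job per run number; ${runs} is the current element -->
    <Job>
      <Input src="/work/expt/raw/run${runs}.data" dest="input.data" />
      <Output src="output.data" dest="/work/expt/recon/run${runs}.out" />
    </Job>
  </ForEach>
</Request>
```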