FAQ

How does Auger handle different types of input files?

Auger stages an input file to a farm node in one of two ways, depending on where the file is stored.

  1. If an input file comes from the tape library (/mss/xxx... or <Input src="mss:/mss/xxx..."), Auger jcaches the file first and then makes a link in WORKING DIR to the file in /cache/mss/. In this case, the input file does not count toward the disk allocation of the job.
  2. If an input file is on /home, /volatile, or /work, Auger copies it to WORKING DIR on the local disk of the node. In this case, the input file does count toward the disk usage of the job. Please request enough disk space to hold both input and output files when submitting a job.
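The two rules above can be sketched as a small shell function. This is only an illustration, not part of Auger: the function name staging_mode is hypothetical, and the sample paths in the usage lines are placeholders.

```shell
#!/bin/sh
# Hypothetical helper (not an Auger command): report how an input path
# would be staged according to the two rules described above.
staging_mode() {
    path=${1#mss:}    # accept both /mss/... and mss:/mss/...
    case "$path" in
        /mss/*)
            echo "link" ;;     # jcached, then linked into WORKING DIR; not counted as disk
        /home/*|/volatile/*|/work/*)
            echo "copy" ;;     # copied to local disk; counts toward the job's disk usage
        *)
            echo "unknown" ;;  # location not covered by this FAQ entry
    esac
}

# Usage (placeholder paths):
staging_mode mss:/mss/xxx/file.dat
staging_mode /work/xxx/file.dat
```

Files reported as "copy" are the ones whose sizes need to be included in the disk request.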

What is the best practice for submitting a large number of I/O-intensive jobs?

If a farm job reads or writes its input and output files intensively, it should access files stored on the local disk of the node. If an input file is on /home, /work or /volatile, please use the jsub INPUT tag so that Auger copies the file to the farm node. If the input file comes from the tape system and the job will access it repeatedly, it is recommended to copy the file to the assigned farm node as well. One can easily accomplish this by adding these two commands at the beginning of the job: rm file-name; cp /cache/mss/.../file-name . (the rm removes the link that Auger created in the working directory, and the trailing "." makes cp place the copy there). Otherwise, the job will read from and write to this file directly on Lustre. If the I/O consists of many small operations, it will slow down the Lustre file system dramatically. This is especially important when running many copies of one type of job, since the local disks in aggregate have 5 times the bandwidth and performance of the entire Lustre system.
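The "rm, then cp" recipe can be wrapped in a small function so that it is safe to call on any input name. This is a minimal sketch, not an Auger feature: the name localize_input is hypothetical, and it assumes the job runs in the working directory where Auger created the link and that readlink -f is available on the node.

```shell
#!/bin/sh
# Hypothetical helper: replace a symlinked input file in the current
# directory with a real copy on the local disk.
localize_input() {
    name=$1
    [ -L "$name" ] || return 0                  # not a link: already a local file
    target=$(readlink -f "$name") || return 1   # resolve the file the link points to
    rm -- "$name" && cp -- "$target" "$name"    # same effect as: rm file-name; cp <target> .
}

# Usage at the beginning of a job script:
#   localize_input file-name
```

Because the function resolves the link itself, the job script does not need to hard-code the /cache/mss/ path of the file.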

Use multi-threaded code rather than multiple single-threaded jobs. This uses the available memory more efficiently and allows a job to have a footprint as large as the entire node's memory (currently the largest is 32 GB).

How do I submit a multi-threaded job?

Use the CPU: X tag or <CPU core="X"/> to request X cores for a job. See Auger Examples for more information.

How do I run a Java program in a farm job?

Q: When I use the default Java (64-bit) to run a Java program, it works on ifarm but fails in a farm job with this error:

    Error occurred during initialization of VM
    Could not reserve enough space for object heap

A: The 64-bit Java reserves a lot of virtual memory (>10 GB) for its Java VM, while a batch job has a small memory limit because it runs inside the PBS batch system. Please use the 32-bit Java under /apps, with the -Xmx option to set the maximum heap size, when submitting a Java job. For example, to specify 512 MB:

    /apps/scicomp/java/jdk1.7/bin/java -Xmx512m -version
    /apps/scicomp/java/jdk1.8/bin/java -Xmx512m -version

In addition, please add a "MEMORY xxx MB" tag to your jsub script to request enough memory (200 MB more than the maximum heap size set with the -Xmx option).
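The memory request follows directly from the heap size. As a small sketch of the arithmetic (the function name memory_request_mb is hypothetical, and the 200 MB margin is the one recommended above):

```shell
#!/bin/sh
# Hypothetical helper: given the -Xmx heap size in MB, print the value
# to use in the "MEMORY xxx MB" tag (heap size plus a 200 MB margin).
memory_request_mb() {
    heap_mb=$1
    echo $((heap_mb + 200))
}

# Usage: for -Xmx512m, request 512 + 200 MB.
HEAP_MB=512
echo "java option: -Xmx${HEAP_MB}m"
echo "jsub tag:    MEMORY $(memory_request_mb "$HEAP_MB") MB"
```

For the 512 MB example above, this gives a request of 712 MB.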

Alternatively, submit the job as a multi-threaded job; you can then run a large Java virtual machine.