Cache manager policy

/cache Backup policy:
- Files are backed up to tape if their size is between 3 MB and 300 GB and they are older than the backup threshold (12 days).
- For files between 1 MB and 3 MB, users can call srmPut to back them up to the tape library if needed. srmPut fails for files smaller than 1 MB.
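
The backup rules above can be read as a small decision function. The following is a minimal sketch, not the cache manager's actual logic. It assumes decimal units (1 MB = 10^6 bytes), that a file of exactly 3 MB falls in the automatic range, and that "older than" means strictly more than 12 days. The names `classify` and `BackupAction` are hypothetical, and files above 300 GB are reported as unspecified because the policy does not say how they are handled.

```python
from enum import Enum

# Assumption: decimal units; the policy does not state which convention applies.
MB = 10**6
GB = 10**9

MANUAL_MIN_BYTES = 1 * MB    # below this, srmPut fails
AUTO_MIN_BYTES = 3 * MB      # lower bound for automatic backup
AUTO_MAX_BYTES = 300 * GB    # upper bound for automatic backup
BACKUP_AGE_DAYS = 12         # backup threshold from the policy


class BackupAction(Enum):
    AUTOMATIC = "backed up to tape automatically"
    WAITING = "right size, but not yet older than the threshold"
    MANUAL = "user may call srmPut"
    TOO_SMALL = "srmPut would fail"
    UNSPECIFIED = "not covered by the stated policy"


def classify(size_bytes: int, age_days: float) -> BackupAction:
    """Map a file's size and age to the outcome described by the policy."""
    if size_bytes < MANUAL_MIN_BYTES:
        return BackupAction.TOO_SMALL
    if size_bytes < AUTO_MIN_BYTES:
        return BackupAction.MANUAL
    if size_bytes <= AUTO_MAX_BYTES:
        # Size qualifies; the age threshold decides whether backup happens yet.
        if age_days > BACKUP_AGE_DAYS:
            return BackupAction.AUTOMATIC
        return BackupAction.WAITING
    return BackupAction.UNSPECIFIED
```

For example, `classify(5 * MB, 13)` yields `BackupAction.AUTOMATIC`, while a 2 MB file of any age yields `BackupAction.MANUAL`.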

/cache Deletion policy:

- Any file larger than 10 TB will be deleted from the /cache file system, and an email will be sent to <user>@jlab.org.
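
To check ahead of time whether any files in a directory tree would exceed this limit, a short scan along the following lines may help. This is a sketch under stated assumptions: 1 TB is taken as 10^12 bytes, and the function name `oversized_files` is hypothetical. It is not a site-provided tool and it only reports files; it deletes nothing.

```python
import os

# Assumption: decimal terabytes; the policy does not state the unit convention.
TB = 10**12
DELETION_LIMIT_BYTES = 10 * TB


def oversized_files(root, limit_bytes=DELETION_LIMIT_BYTES):
    """Return (path, size) pairs for files under root larger than limit_bytes."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                # File vanished or is unreadable; skip it rather than abort the scan.
                continue
            if size > limit_bytes:
                found.append((path, size))
    return found
```

Calling `oversized_files("/cache/<project>")` with your own project path would list any files at risk; the path shown is a placeholder.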

FAQ

Interactive Access

The LQCD / HPC systems have multiple interactive nodes configured with different hardware, and are best accessed via appropriate alias names:

File Systems

Each project can request a quota on each of the shared resources, and the path for that project is based upon the project name:

Allocations & fair share

Allocations

Batch example

Here is a simple batch script example:

Node tags table

Hint: Use tags sparingly, as they tend to restrict the set of nodes eligible to run your job. You can view a node's tags by running pbsnodes -a <nodename>.

For the phi queue, there are several useful tags for choosing nodes, depending on your use of the fast MCDRAM:

Batch System

Submitting Batch Jobs
You can submit jobs from one of the interactive nodes or from within a running batch script. Batch jobs are submitted using standard qsub commands, with a valid project account, to one of the following sets of production or debug queues. You can specify options on the command line or (recommended) put all of them into your batch script file. See examples below. A single PBS server runs on qcdpbs, and PBS commands default to using that server.
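
Since jobs can be submitted from scripts, the sketch below shows one way to assemble a qsub command line in Python. It relies only on the standard PBS/Torque options `-A` (account string) and `-q` (destination queue). The function name `build_qsub_command` and the project, queue, and script names are hypothetical placeholders, not real values for this site.

```python
import shlex


def build_qsub_command(script_path, project, queue):
    """Build a qsub argument list; nothing is submitted here."""
    # -A sets the account (project) string, -q selects the destination queue.
    return ["qsub", "-A", project, "-q", queue, script_path]


# Hypothetical values for illustration only.
cmd = build_qsub_command("job.sh", "myproject", "debug")
print(shlex.join(cmd))
```

On an interactive node, the list could be passed to `subprocess.run(cmd, check=True)` to submit the job. If the options are already set inside the batch script, as recommended, the command reduces to `qsub` followed by the script name.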

Network certificate

Network Authentication
Auger (the batch system) and Jasmine (the mass storage manager) both require Jefferson Lab certificates for authentication. The certificate identifies the user to the Auger (Jasmine) server in a cryptographically secure manner without requiring the user to enter their password. This allows users to create scripts which submit jobs for them while still providing the security required to ensure that users can only submit jobs for themselves. Users generate their certificate by executing the following command:

Batch system basics

Auger
Users interact with the batch system through a layer (Auger) that provides additional useful features on top of the basic functions of the Slurm batch system.

 
