/work from within a batch job

It is recommended that you use the cache_cp script to copy files
between /work and the compute nodes used by your jobs. At this writing,
rcp is symlinked to cache_cp. Invoke cache_cp exactly as you would rcp,
except that you do not need to name the server that hosts the source
or destination under /work. The script fills in the appropriate server
for you.

cache_cp overview

cache_cp, the client program, reads a configuration file /etc/cache_cp.conf, which
contains the port for the cache_cpd daemon, and a series of mappings like:

#               hpcdata7
/cache/LHPC                             hpcdata7:/cache7/LHPC
/cache/casa                             hpcdata7:/cache7/casa

#               hpcdata8
/work/HASTE                             hpcdata8:/work/HASTE
/work/JLabLQCD                          hpcdata8:/work/JLabLQCD

The cache_cp program then replaces a path that matches the left-hand
side of a mapping with the value on the right-hand side. It also
normalizes paths, for example reducing /w/work to just /work. If files
are to be put onto a data server, the client sums the file sizes and
sends the total to the server. If files are to be retrieved from a data
server, the server sums the file sizes.
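
The path translation described above can be sketched in Python. This is
only an illustration, not the actual cache_cp source: the function names
are hypothetical, the parser assumes every mapping line has exactly two
whitespace-separated fields, and longest-prefix matching is an assumed
rule.

```python
import re

# Sample text in the same format as the mappings shown above.
SAMPLE_CONF = """\
#               hpcdata7
/cache/LHPC                             hpcdata7:/cache7/LHPC
/cache/casa                             hpcdata7:/cache7/casa

#               hpcdata8
/work/HASTE                             hpcdata8:/work/HASTE
/work/JLabLQCD                          hpcdata8:/work/JLabLQCD
"""


def parse_mappings(text):
    """Return {local_prefix: "server:remote_prefix"} from config text."""
    mappings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        # Assumption: a mapping line is "<local path> <server>:<remote path>".
        # Anything else (such as the port setting) is ignored here.
        if len(fields) == 2 and ":" in fields[1]:
            mappings[fields[0]] = fields[1]
    return mappings


def normalize(path):
    """Assumed cleanup rule: rewrite a leading /w/work to /work."""
    return re.sub(r"^/w/work(?=/|$)", "/work", path)


def translate(path, mappings):
    """Replace the longest matching left-hand prefix with its right-hand value."""
    path = normalize(path)
    for prefix in sorted(mappings, key=len, reverse=True):
        if path == prefix or path.startswith(prefix + "/"):
            return mappings[prefix] + path[len(prefix):]
    return path  # no mapping applies; leave the path unchanged
```

With the sample mappings, translate("/w/work/HASTE/run1/out.dat", ...)
gives "hpcdata8:/work/HASTE/run1/out.dat", and a path outside every
mapping is returned unchanged.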

These sums determine whether a transfer is "large" or "small". For
simplicity, all recursive copies (those given the -r argument on the
command line) are considered large. Once a transfer is classified as
large or small, it is placed in the corresponding queue on the server.
Each of the two queues has its own limit on the number of concurrent
copies.
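
A minimal sketch of the classification rule, in Python. The threshold
value and the names below are hypothetical: this page does not state the
actual cutoff between "large" and "small", nor whether the comparison is
inclusive.

```python
# Hypothetical cutoff; the real value used by cache_cpd is not documented here.
LARGE_THRESHOLD_BYTES = 1024 ** 3


def classify_transfer(file_sizes, recursive, threshold=LARGE_THRESHOLD_BYTES):
    """Return "large" or "small" for a transfer.

    file_sizes: sizes in bytes of the files to be copied
    recursive:  True if -r was given on the command line
    """
    if recursive:
        # Recursive copies are always treated as large.
        return "large"
    total = sum(file_sizes)
    # Assumption: a total equal to the threshold counts as large.
    return "large" if total >= threshold else "small"
```

The returned label would then select which of the two server-side queues
the request joins.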

A PUT or GET request is sent to the server, and the server sends back
a "TICKET", which is the MD5 checksum of the request. The client then
loops, asking the server whether the "TICKET" is next in line. If it
is, the server replies "OK". If not, the server tells the client to
sleep for a period of time.
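
The ticket exchange can be illustrated with a small in-memory model in
Python. Everything here is an assumption made for illustration: the
FakeServer class, the "SLEEP" reply name, the request strings, and the
single first-in-first-out queue are hypothetical, and the real wire
protocol of cache_cpd is not described on this page.

```python
import hashlib
import time


def make_ticket(request):
    """Ticket = MD5 hex digest of the request text (UTF-8 encoding assumed)."""
    return hashlib.md5(request.encode("utf-8")).hexdigest()


class FakeServer:
    """In-memory stand-in for the daemon: one FIFO queue, one active slot."""

    def __init__(self, sleep_seconds=0.01):
        self.queue = []
        self.sleep_seconds = sleep_seconds

    def submit(self, request):
        """Accept a PUT or GET request and hand back its ticket."""
        ticket = make_ticket(request)
        self.queue.append(ticket)
        return ticket

    def poll(self, ticket):
        """Reply "OK" if the ticket is next in line, else ask the client to sleep."""
        if self.queue and self.queue[0] == ticket:
            return ("OK", 0)
        return ("SLEEP", self.sleep_seconds)

    def done(self, ticket):
        """The client reports that its transfer has finished."""
        self.queue.remove(ticket)


def wait_for_turn(server, ticket, max_polls=100):
    """Poll until the server says "OK"; return the number of polls used."""
    for polls in range(1, max_polls + 1):
        status, delay = server.poll(ticket)
        if status == "OK":
            return polls
        time.sleep(delay)  # sleep for the period the server asked for
    raise TimeoutError("ticket never reached the head of the queue")
```

In this model a second client keeps receiving "SLEEP" until the first
client calls done(), at which point its next poll returns "OK".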

Once the client gets the "OK", the rcp starts. During the transfer,
the client keeps the server informed that the transfer is in progress.
Once the transfer is done, the client sends a "DONE" to the server.

The server has a supervisor thread that ensures that the
connection states are correct, and it will purge any requests that have
been idle for over 2 minutes.
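
The idle purge can be sketched in Python as follows. The RequestTable
class and its method names are hypothetical; only the 2-minute limit
comes from the description above. How the real supervisor thread tracks
activity is not documented here.

```python
import time

IDLE_LIMIT_SECONDS = 120  # "idle for over 2 minutes"


class RequestTable:
    """Tracks the last time each ticket was heard from."""

    def __init__(self, idle_limit=IDLE_LIMIT_SECONDS, clock=time.monotonic):
        self.idle_limit = idle_limit
        self.clock = clock      # injectable so the logic can be tested
        self.last_seen = {}     # ticket -> time of last client message

    def touch(self, ticket):
        """Record activity, e.g. a poll or an in-progress update."""
        self.last_seen[ticket] = self.clock()

    def purge_idle(self):
        """Drop tickets idle for longer than the limit; return those dropped."""
        now = self.clock()
        stale = [t for t, seen in self.last_seen.items()
                 if now - seen > self.idle_limit]
        for ticket in stale:
            del self.last_seen[ticket]
        return stale
```

A supervisor thread could call purge_idle() periodically, so a client
that dies mid-queue does not block the requests behind it.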