Data Management FAQ

What is the best/fastest way to move files from /volatile to /cache?

/volatile and /cache are remounts of /lustre/volatile and /lustre/cache. If you use the mv command to move files between the two and fully qualify both paths, i.e., use mv /lustre/volatile/PROJECT/src/files /lustre/cache/PROJECT/dest/files, the move is a simple rename and is effectively instantaneous. By contrast, mv /volatile/PROJECT/src/files /cache/PROJECT/dest/files requires a copy of all of the source files instead of just a rename.
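The path rewrite described above can be sketched in Python. This is a minimal illustration, not a site-provided tool: the helper names to_lustre_path and move_within_lustre are hypothetical, and the mapping covers only the two remounts named in this answer.

```python
import os

# Remounted prefixes named in this FAQ, mapped to their underlying Lustre paths.
REMOUNTS = {
    "/volatile": "/lustre/volatile",
    "/cache": "/lustre/cache",
}


def to_lustre_path(path):
    """Return the fully qualified /lustre form of a /volatile or /cache path.

    Hypothetical helper: paths outside the two remounts are returned unchanged.
    """
    path = os.path.normpath(path)
    for mount, target in REMOUNTS.items():
        # Match the mount point itself or anything below it, but not e.g. /cachefoo.
        if path == mount or path.startswith(mount + "/"):
            return target + path[len(mount):]
    return path


def move_within_lustre(src, dst):
    """Rename src to dst using fully qualified paths on both sides."""
    # os.rename does not fall back to copying: it raises OSError if the two
    # paths turn out to be on different filesystems.
    os.rename(to_lustre_path(src), to_lustre_path(dst))
```

For example, to_lustre_path("/volatile/PROJECT/src/files") returns "/lustre/volatile/PROJECT/src/files". Using os.rename rather than a copying move means a mistake in the paths fails loudly instead of silently triggering a full copy.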

When will a file under /cache be backed up?

By default, files in /cache that have reached 12 days of age and whose size is within the accepted range are automatically backed up to the tape library. A user can use srmPut to flush files to tape at any time. Check the Cache Manager Policy page for up-to-date details.

Why is it a good practice to tar small files?

Although /cache is designed to store large data files, a user can
create a file of any size. However, it is recommended to pack small
files into one large file if there are more than 100 files in a group.
Putting or getting a very large number of small files to or from the
tape system might take many days, whereas a single tar file could be
written or retrieved in a small fraction of that time.
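Packing a group of small files can be done with the standard tar command or, as sketched below, with Python's tarfile module. The function names pack_directory and list_members are illustrative only, and the sketch writes an uncompressed archive; whether to compress is left to the user.

```python
import os
import tarfile


def pack_directory(src_dir, tar_path):
    """Pack everything under src_dir into one uncompressed tar archive."""
    src_dir = os.path.abspath(src_dir)
    with tarfile.open(tar_path, "w") as archive:
        # arcname stores members relative to the directory's own name
        # instead of embedding the absolute path.
        archive.add(src_dir, arcname=os.path.basename(src_dir))
    return tar_path


def list_members(tar_path):
    """Return the sorted names of the regular files stored in the archive."""
    with tarfile.open(tar_path, "r") as archive:
        return sorted(m.name for m in archive.getmembers() if m.isfile())
```

Listing the members after packing is a cheap way to confirm that every small file made it into the archive before the originals are removed.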

Can a file/directory be renamed or moved after it is backed up?

No, a user shouldn't rename or move a file after it is put onto tape.
There is a one to one mapping between the file path in cache and that in
the tape library. Once a file is put into the tape library, the
content, location and name should not be altered. If a file is
moved/renamed, it will be treated as a new file and will be backed up
again. This will not only waste tape, it will also create confusion for
future users who may want to use the data. Please send a CCPR if you want to rename a cache directory after its files have been backed up to the tape system.

Why shouldn't a user put a personal name in a directory path under /cache/project?

The /cache/project area is designed to store project-related data.
The data files are usually grouped and stored according to metadata
parameters and are shared by multiple users. It is a bad idea to name any
directory using personal information, including a user name. However,
please don't change any existing directory name even if it is named
incorrectly (see above). If working outside of
existing projects, please request an allocation for a new small project.

Why does a file disappear from /cache disk?

The /cache disk is managed by cacheManager, a daemon which backs up files and cleans up the disk
for new data files. An old file which is already on tape and which is
not marked in use can be deleted whenever disk space is needed.

How can I mark a file in use to prevent it from being deleted?

The CacheManager software provides a utility srmPin to mark a file as in
use, so it will not be deleted from disk. Please refer to the cache manager utilities page for usage of this utility. Use the [-t] option to specify the lifetime; the default lifetime is 30 days.

How can I delete a file from disk but ensure it is first backed up?

You can use any Unix command to delete files under /cache. It is
recommended to use "srmPut -d path" for a permanent file to ensure it is
backed up before it is removed from disk. The srmPut command sends a
delete request to the cacheManager daemon, but the content of the file
in cache is compared with the copy on tape (by comparing MD5 checksums)
before the file is deleted. In this way, it is guaranteed that the file
has an archive copy.
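The idea of comparing MD5 checksums before deleting can be illustrated with a short Python sketch. This is not how srmPut or the cacheManager daemon is implemented; the function names md5_of_file and same_content are hypothetical, and both arguments are assumed to be ordinary files readable from disk.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """Return True when the two files have identical MD5 digests."""
    return md5_of_file(path_a) == md5_of_file(path_b)
```

A user could apply the same check when keeping a second copy of a file elsewhere, deleting the original only if same_content returns True.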