Data Management FAQ

What is the best/fastest way to move files from /volatile to /cache?

/volatile and /cache are actually remounts of /lustre/volatile and /lustre/cache. If you use the mv command with fully qualified paths on both sides, i.e., mv /lustre/volatile/PROJECT/src/files /lustre/cache/PROJECT/dest/files, the move is just a rename and is effectively instantaneous. Running mv /volatile/PROJECT/src/files /cache/PROJECT/dest/files instead requires a copy of all of the source files rather than a rename.
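As a minimal sketch of the path qualification described above, the shell functions below rewrite /volatile and /cache paths to their /lustre equivalents before calling mv. The function names to_lustre_path and lustre_mv are hypothetical helpers made up for this illustration, not site-provided tools.

```shell
# Hypothetical helper: map a /volatile or /cache path to its /lustre equivalent.
# Paths that do not start with /volatile or /cache are printed unchanged.
to_lustre_path() {
    case "$1" in
        /volatile|/volatile/*|/cache|/cache/*) printf '/lustre%s\n' "$1" ;;
        *) printf '%s\n' "$1" ;;
    esac
}

# Hypothetical wrapper: move SRC to DEST using fully qualified /lustre paths,
# so that mv can rename instead of copying.
lustre_mv() {
    mv -- "$(to_lustre_path "$1")" "$(to_lustre_path "$2")"
}

# Example call (commented out; PROJECT is a placeholder):
# lustre_mv /volatile/PROJECT/src/files /cache/PROJECT/dest/files
```

Typing the /lustre paths by hand works just as well; the wrapper only saves remembering to do it.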

When will a file under /cache be backed up?

By default, files in /cache that are 12 days old will be automatically backed up. Files larger than 1MB will be migrated to the tape library. A user can use srmPut to flush files to tape at any time.

Why is it a good practice to tar small files?

Although /cache is designed to store large data files, a user can create a file of any size. However, it is recommended to pack small files into one large file if there are more than 100 files in a group. Putting or getting a very large number of small files to or from the tape system might take many days, whereas a single tar file could be written or retrieved in a small fraction of that time.
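A minimal sketch of packing a group of small files into one archive with the standard tar command is shown below. The function name pack_dir and the example paths are hypothetical, chosen only for illustration.

```shell
# Hypothetical helper: pack the directory SRC_DIR into the tar archive ARCHIVE.
# The -C option makes the archive store paths relative to SRC_DIR's parent,
# so unpacking recreates just the directory itself.
pack_dir() {
    src_dir=$1
    archive=$2
    tar -cf "$archive" -C "$(dirname "$src_dir")" "$(basename "$src_dir")"
}

# Example call (commented out; PROJECT and run01 are placeholders):
# pack_dir /volatile/PROJECT/run01 /volatile/PROJECT/run01.tar
```

The contents of the resulting archive can be listed with tar -tf before it is moved into /cache.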


Can a file/directory be renamed or moved after it is backed up?

No, a user shouldn't rename or move a file after it has been put onto tape. There is a one-to-one mapping between the file path in cache and the path in the tape library. Once a file is in the tape library, its content, location, and name should not be altered. If a file is moved or renamed, it will be treated as a new file and will be backed up again. This not only wastes tape, it also creates confusion for future users who may want to use the data. Please submit a CCPR if you want to rename a cache directory after its files have been backed up to the tape system.

Why shouldn't a user put a personal name in a directory path under /cache/project?

The /cache/project area is designed to store project-related data. The data files are usually grouped and stored according to metadata parameters, and are shared by multiple users. It is a bad idea to name any directory using personal information, including a user name. However, please don't change any existing directory name even if it is named incorrectly (see above). If you are working outside of existing projects, please request an allocation for a new small project.

Why does a file disappear from /cache disk?

The /cache disk is managed by cacheManager, a daemon which backs up files and cleans up the disk for new data files. An old file which is already on tape and which is not marked in use can be deleted whenever disk space is needed.

How can I mark a file in use to prevent it from being deleted?

The CacheManager software provides a utility, srmPin, to mark a file as in use so that it will not be deleted from disk. Please refer to the cache manager utilities page for usage of this utility. Use the [-t] option to specify the lifetime; the default lifetime is 30 days.

How can I delete a file from disk but ensure it is first backed up?

You can use any Unix command to delete files under /cache. For a permanent file, it is recommended to use "srmPut -d path" to ensure it is backed up before it is removed from disk. The srmPut command sends a delete request to the cacheManager daemon, which compares the content of the file in cache with the copy on tape (by comparing MD5 checksums) before deleting it. In this way, it is guaranteed that the file has an archive copy.
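To illustrate the verify-before-delete idea described above, here is a minimal sketch that compares MD5 checksums of two local files and removes the first only if they match. It is not how srmPut or cacheManager is implemented. The function names same_md5 and delete_if_verified are hypothetical, and the sketch assumes the GNU md5sum command is available and that a reference copy exists as an ordinary file.

```shell
# Hypothetical helper: succeed only if both files exist and have the same MD5 checksum.
same_md5() {
    [ -f "$1" ] && [ -f "$2" ] || return 2
    sum_a=$(md5sum < "$1" | cut -d' ' -f1)
    sum_b=$(md5sum < "$2" | cut -d' ' -f1)
    [ -n "$sum_a" ] && [ "$sum_a" = "$sum_b" ]
}

# Hypothetical helper: delete FILE only if it matches REFERENCE_COPY.
delete_if_verified() {
    if same_md5 "$1" "$2"; then
        rm -- "$1"
    else
        echo "checksum mismatch or missing file; keeping $1" >&2
        return 1
    fi
}

# Example call (commented out; both paths are placeholders):
# delete_if_verified /path/to/file /path/to/reference_copy
```

For files under /cache, "srmPut -d path" remains the recommended route, since the comparison against the tape copy is done for you.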