Storage on HPC#
HOME directory#
When you log in to ARC3 or ARC4, you automatically start in your HOME directory. Your HOME directory is backed up weekly and is shared across both HPC systems. It has a quota of 10GB, which fills up quickly. Exceeding your quota can result in errors and job failures.
Check HOME directory usage with the quota command:
$ quota -s
Disk quotas for user exuser:
     Filesystem   space   quota   limit   grace   files   quota   limit   grace
nas-ufaservn1:/export/home/home01
                  7897M  10240M  11264M           25194       0       0
The -s argument provides more human-readable output. The example output shows that exuser has used 7897MB (about 7.7GB) of space, the quota is 10240MB (10GB), the hard limit is 11264MB (11GB), and there are 25194 files.
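The arithmetic behind reading that output can be sketched in a few lines of shell. The helper name percent_used is hypothetical, and it assumes both values carry the M suffix shown in the example above; output reported in other units would need extra handling.

```shell
# Hypothetical helper: percentage of quota used, given the "space" and
# "quota" columns from `quota -s` (assumes both end in M, as in the example).
percent_used() {
    used=${1%M}      # strip the trailing M, e.g. 7897M -> 7897
    quota=${2%M}     # e.g. 10240M -> 10240
    echo $(( used * 100 / quota ))   # integer division, rounds down
}

percent_used 7897M 10240M   # prints 77
```

For exuser this gives 77, so about three quarters of the 10GB quota is already in use.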
/nobackup#
Each HPC system has its own /nobackup directory, built on the Lustre parallel filesystem. ARC3 has ~836TB at 4GB/s and ARC4 has ~1.2PB at 11GB/s. ARC3 has 3191616896 (3191 million) inodes and ARC4 has 936379680 (936 million) inodes. Each file and directory requires an inode.
Only a small fraction of the storage capacity and inodes is generally available to any one job or user.
Some input data commonly used by groups of users is stored on /nobackup.
Users should only store data needed for current projects and processing on /nobackup. If you expect to use more than 1 terabyte of file space on /nobackup on ARC3 or ARC4, or more than 1 million inodes, please liaise with the RSE team and your supervisor before staging your jobs (transferring data and submitting to the scheduler). In general, please liaise with your supervisor about your HPC data management, processing and transfer.
The file system works more efficiently with fewer files and with plenty of free capacity (more than 30% available). Transferring large amounts of data to and from /nobackup can also be costly.
On the HPC systems, processing workflows are more efficient if they use a small number of large files rather than a large number of small files. Creating, storing and reading large numbers of files on /nobackup can be problematic, so you are encouraged to develop processing workflows that create, store and read only a small number of files.
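One common way to reduce the file count is to bundle a directory of small files into a single archive. The sketch below uses tar in a temporary directory so it is safe to run anywhere; the directory and file names are illustrative only, and whether archiving suits your workflow depends on how the files are read later.

```shell
# Work in a scratch directory so the example does not touch real data.
workdir=$(mktemp -d)
mkdir "$workdir/results"
for i in 1 2 3 4 5; do
    echo "sample $i" > "$workdir/results/part_$i.txt"
done

# Bundle the whole directory into one compressed archive:
# one file on disk instead of five files plus a directory.
tar -czf "$workdir/results.tar.gz" -C "$workdir" results

# List the archive contents without unpacking, to check it is complete.
tar -tzf "$workdir/results.tar.gz"

# Once the archive has been checked, remove the small files.
rm -r "$workdir/results"
```

The archive can be unpacked again with tar -xzf when the individual files are needed.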
In general, it is best to scale up gradually to larger amounts of data and to benchmark as you go (record how long things take and how much resource they require at each scale). Understanding how much data storage you will need and how many files you will create can be as important as understanding how much memory and processing time is required for different numbers of processing cores/nodes.
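A minimal sketch of this kind of benchmarking is shown below. The run_step function is a hypothetical stand-in for your own processing step; here it simply writes N small files so that the script can be run anywhere. For each problem size, the loop records elapsed seconds, the number of files created and the disk space used.

```shell
# Hypothetical stand-in workload: writes N small files into a directory.
# Replace the body with your own processing step.
run_step() {
    n=$1
    dir=$2
    i=1
    while [ "$i" -le "$n" ]; do
        echo "record $i" > "$dir/out_$i.txt"
        i=$(( i + 1 ))
    done
}

scratch=$(mktemp -d)
echo "n seconds files kbytes"
for n in 10 100 1000; do
    dir="$scratch/n_$n"
    mkdir "$dir"
    start=$(date +%s)
    run_step "$n" "$dir"
    end=$(date +%s)
    files=$(find "$dir" -type f | wc -l | tr -d ' ')   # files created
    kbytes=$(du -sk "$dir" | cut -f1)                  # disk space in KB
    echo "$n $(( end - start )) $files $kbytes"
done
```

Comparing the rows shows how time, file count and storage grow with problem size, which helps when estimating the cost of the next, larger run.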
To check your quota on /nobackup, use the lfs quota command:
$ lfs quota -h /nobackup
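To see roughly how many inodes a particular directory tree consumes, you can count its files and directories with find. The helper name count_inodes is hypothetical, and the count is approximate (for example, hard links share an inode). The demonstration uses a temporary directory; on the cluster you would point it at your own directory on /nobackup. Walking a very large tree is itself a heavy operation on a parallel filesystem, so use it sparingly.

```shell
# Hypothetical helper: count the files and directories under a path,
# including the path itself (each needs one inode).
count_inodes() {
    find "$1" | wc -l | tr -d ' '
}

# Demonstration on a small temporary tree.
demo=$(mktemp -d)
mkdir "$demo/run1"
touch "$demo/run1/a.dat" "$demo/run1/b.dat"

count_inodes "$demo"   # prints 4: demo, run1, a.dat and b.dat
```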
Warning
/nobackup is **not** backed up, and files that have not been accessed for 90 days are removed.
To ensure sufficient space and bandwidth are available, files on /nobackup are purged periodically.
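To spot files that may be approaching removal, you can list those with an old access time. The sketch below assumes that the purge is based on the access time reported by find -atime, which the text above does not confirm, and the 60-day threshold is an arbitrary illustrative choice. It runs against a temporary directory containing one recent file and one file whose access time is set to January 2020.

```shell
# Demonstration tree: one recently accessed file and one stale file.
demo=$(mktemp -d)
touch "$demo/recent.dat"
touch -a -t 202001010000 "$demo/stale.dat"   # access time set to 1 Jan 2020

# List regular files last accessed more than 60 days ago.
stale=$(find "$demo" -type f -atime +60)
echo "$stale"   # prints only the path to stale.dat
```

Any files listed that you still need should be copied to backed-up storage rather than left on /nobackup.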
Accessing /nobackup#
To access /nobackup, use the cd command:
$ cd /nobackup
You are encouraged to keep your files on /nobackup in a directory named after your USERNAME. If your USERNAME were exuser, the following would create that directory and change into it:
$ mkdir /nobackup/exuser
$ cd /nobackup/exuser
/resstore#
Another location where data is stored for processing on the HPC systems is /resstore. If you need to use data stored in /resstore, you will need to be added to a specific group.