Storage Overview

Aire offers versatile storage solutions to support diverse research workflows. This guide explains the available storage options, their key features, and best practices for efficient data and quota management. Use the information below to make informed decisions and optimize your HPC work.

Summary of Storage Types

The table below provides a high‑level comparison of each storage option. Note that the associated environment variables (e.g., $HOME, $SCRATCH) simplify navigation in your workflows by automatically pointing to the correct directories.

Storage Type

Details

Home Folder

Path: /users/<username>
Env Variable: $HOME
Quota: 30GB, 1 million files
Backup: ✅ Yes
Automatic Deletion: ❌ No
Best For: Persistent small files (scripts, notes, configs)

Scratch on Lustre (Disk‑based)

Path: /mnt/scratch/<username>
Env Variable: $SCRATCH
Quota: 1TB, 1.5 million files
Backup: ❌ No
Automatic Deletion: ❌ No
Best For: Large datasets

Flash on Lustre (NVMe‑based)

Path: /mnt/flash/tmp/job.<JOB-ID>
Env Variable: $TMP_SHARED
Quota: 1TB, 1.5 million files (per job)
Backup: ❌ No
Automatic Deletion: ✅ Yes
Best For: I/O‑intensive tasks

Scratch on Computing Nodes

Path: /tmp/job.<JOB-ID>
Env Variable: $TMP_LOCAL, $TMPDIR
Quota: None, subject to node storage availability
Backup: ❌ No
Automatic Deletion: ✅ Yes
Best For: Single‑node jobs needing fast, localized storage
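
As a quick check, the variables listed in the table can be printed from a shell. The following is a minimal sketch: `show_storage_vars` is a hypothetical helper name, and it assumes the job-specific variables ($TMP_SHARED, $TMP_LOCAL) are only defined inside a running job, since their paths contain a job ID.

```bash
# show_storage_vars: print each storage variable from the table,
# or "(not set)" when it is not defined in the current environment.
show_storage_vars() {
    local name value
    for name in HOME SCRATCH TMP_SHARED TMP_LOCAL TMPDIR; do
        # printenv prints the value of an exported variable and fails if it is unset
        value=$(printenv "$name") || value="(not set)"
        printf '%-11s %s\n' "$name" "$value"
    done
}

show_storage_vars
```

Running it on a login node and again inside a job shows which locations are available in each context.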

Key Information

  • Temporary Data: Data in $TMP_SHARED, $TMPDIR, and $TMP_LOCAL is automatically deleted when a job completes. Copy anything you need to keep to $SCRATCH or $HOME before the job ends.

  • No Backups: Data in $SCRATCH, $TMP_SHARED, $TMPDIR, and $TMP_LOCAL is not backed up. Archive critical files to your Home Folder or external storage.
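
Because job-temporary directories are removed when the job completes, results have to be copied out before the job script exits. The sketch below shows one way to do that; `save_results`, the `output` subdirectory, and the `results/run-001` destination are hypothetical names chosen for illustration.

```bash
# save_results SRC DEST: copy the contents of SRC into DEST, creating DEST if needed.
save_results() {
    local src="$1" dest="$2"
    mkdir -p "$dest" || return 1
    # -a preserves timestamps and permissions; "SRC/." copies the contents, dotfiles included
    cp -a "$src/." "$dest/"
}

# Hypothetical use as the last step of a job script:
#   save_results "$TMP_SHARED/output" "$SCRATCH/results/run-001"
```

Since $SCRATCH is not backed up either, anything critical still needs a further copy to the Home Folder or external storage.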


Best Practices for Storage Management

Follow these guidelines to help you efficiently manage your data on Aire’s HPC system:

  1. Choose the Right Storage for Your Task:

    • Large, intermediate datasets should be managed on Scratch on Lustre.

    • For single‑node, low‑latency tasks, use node‑local scratch ($TMP_LOCAL).

    • Use Flash on Lustre ($TMP_SHARED) for I/O‑intensive operations that need fast temporary storage.

  2. Clean Up After Jobs (Where Necessary):

    • Scratch on Lustre requires manual cleanup. Remove unneeded files to free up space.

  3. Back Up Critical Files:

    • Always archive important data externally.

    • Since temporary storage is not backed up, ensure critical results are saved safely.

  4. Organize and Optimize File Usage:

    • Structure directories logically to facilitate efficient data retrieval.

    • Where possible, aggregate many small files into archives (e.g., using tar) to improve performance on Lustre systems.

  5. Monitor Your Storage Usage:

    • Home quota: quota -s

    • Scratch quota: lfs quota -h -u $USER /scratch

    • Flash quota: lfs quota -h -u $USER /flash

    • Proactive storage management helps prevent interruptions during critical work.
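
The tar suggestion in point 4 can be sketched as follows. `pack_dir` is a hypothetical helper, not an Aire-provided tool; it writes a compressed archive next to the directory and checks that the archive can be listed, leaving deletion of the originals to you.

```bash
# pack_dir DIR: write DIR.tar.gz next to DIR, verify it, and print the archive path.
pack_dir() {
    local dir="${1%/}"              # strip a trailing slash, if any
    local archive="$dir.tar.gz"
    # -C switches to the parent directory so the archive stores relative paths
    tar -czf "$archive" -C "$(dirname "$dir")" "$(basename "$dir")" &&
        # Listing reads the whole archive; a failure here means it is unusable
        tar -tzf "$archive" > /dev/null &&
        echo "$archive"
}

# Hypothetical use: pack_dir "$SCRATCH/run-001/logs"
```

The file-count quota only benefits once the original directory is removed, so delete it manually after confirming the archive is complete.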


Detailed Storage Descriptions

Home Folder

  • Path & Environment:

    • Directory: /users/<username>

    • Accessible via the $HOME variable and via the ~ shortcut.

  • Quota: 30GB and up to 1 million files.

  • Backup: Yes, periodically. External archiving is still recommended for critical data.

  • Automatic Deletion: No.

  • Usage:
    Appropriate for persistent, small files such as scripts, documentation, and configuration files. Not appropriate for high I/O operations.


Scratch on Lustre (Disk‑based)

  • Path & Environment:

    • Directory: /mnt/scratch/<username>

    • Accessible via the $SCRATCH variable.

    • Symlink: /scratch -> /mnt/scratch

  • Quota: 1TB and up to 1.5 million files.

  • Backup: No.

  • Automatic Deletion: No.

  • Usage:
    Designed for large datasets and active job data. Manual cleanup is essential to avoid exceeding quotas.
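
To support manual cleanup, it can help to see how many files a directory holds and which of them have not been touched recently. The helpers below are an illustrative sketch using standard `find`; the function names and the 90-day threshold in the usage line are assumptions, and the count covers regular files only, so it approximates rather than reproduces the figure reported by the quota tools.

```bash
# count_files DIR: number of regular files under DIR.
count_files() {
    find "$1" -type f | wc -l | tr -d ' '
}

# stale_files DIR DAYS: list regular files under DIR last modified more than DAYS days ago.
stale_files() {
    find "$1" -type f -mtime +"$2"
}

# Hypothetical use:
#   count_files "$SCRATCH"
#   stale_files "$SCRATCH" 90
```

Review the list from `stale_files` before deleting anything, since scratch has no backup to recover from.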


Flash on Lustre (NVMe‑based)

  • Path & Environment:

    • Directory: /mnt/flash/tmp/job.<JOB-ID>

    • Accessible via the $TMP_SHARED variable.

    • Symlink: /flash -> /mnt/flash

  • Quota: 1TB per job and up to 1.5 million files per job.

  • Backup: No.

  • Automatic Deletion: Yes—files are purged upon job completion.

  • Usage:
    Optimized for I/O‑intensive operations such as simulations. Ideal for tasks that require high performance during the job period.


Scratch on Computing Nodes

  • Path & Environment:

    • Directory: /tmp/job.<JOB-ID> (a per‑job directory under the node's /tmp)

    • Accessible via $TMP_LOCAL and $TMPDIR.

  • Quota: None, subject to node storage availability

  • Backup: No.

  • Automatic Deletion: Yes—data is purged after job completion.

  • Usage:
    Best for fast, node‑local storage during single‑node jobs. Data is local to each node and cannot be shared between nodes.
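
The choice between node-local and shared temporary storage can be expressed as a small helper in a job script. This is a sketch under assumptions: `pick_workdir` is a hypothetical function, the node count is passed in by the caller rather than read from the scheduler, and the fallback to /tmp is an illustrative default.

```bash
# pick_workdir [NODES]: print the temporary directory suited to a job on NODES nodes (default 1).
pick_workdir() {
    local nodes="${1:-1}"
    if [ "$nodes" -gt 1 ]; then
        # Multi-node: node-local storage is not visible to other nodes, so use shared flash
        if [ -z "${TMP_SHARED:-}" ]; then
            echo "TMP_SHARED is not set (not inside a job?)" >&2
            return 1
        fi
        printf '%s\n' "$TMP_SHARED"
    else
        # Single node: prefer node-local storage, then TMPDIR, then /tmp
        printf '%s\n' "${TMP_LOCAL:-${TMPDIR:-/tmp}}"
    fi
}

# Hypothetical use: cd "$(pick_workdir 1)"
```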