Data on HPC

Warning If you generate valuable data, it is your responsibility to back it up somewhere other than HPC.
Warning You will want to have a good understanding of how much data your job will produce before embarking on a large run. A large job that writes huge amounts of data from several compute nodes can easily degrade or even crash a data file-server. If in doubt, check with HPC support.
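
One way to get a feel for this is to run a scaled-down test first and measure its output. A minimal sketch, assuming a hypothetical test-run directory named myrun under /pub:

$ du -sh /pub/$USER/myrun    # total data written by the test run
$ df -h /pub                 # free space remaining on the target filesystem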

Public & Private Data Storage:

The HPC Cluster has two main types of Data Storage.

  • Public

  • Private

All data RAID servers on HPC are configured with either RAID5 or RAID6 (most being RAID6) for redundancy, so that data is NOT lost if a disk fails.

RAID levels are important and we use them on HPC, but they do NOT protect against other catastrophic and unplanned events such as a failed RAID controller, server issues, heat damage from A/C loss, fire, etc. So, to repeat: you will want to keep your own backups.

Public Data Storage:

There are five public data storage file systems that are available to all HPC users:

  1. /data/users/$USER - This is your home directory.

  2. /pub/$USER - Main public user data work location.

  3. /fast-scratch/$USER - Very fast disk access for temporary use.

  4. /ssd-scratch/$USER - Scratch filesystem made up of SSD disks.

  5. /scratch - Local and limited disk space on each compute node for temporary use.

Public ( Home Directory ):

Your home directory is located at /data/users/$USER. It is meant for very light disk I/O and has a very small disk quota to keep users from doing any serious work from it.

For any serious work, you will need to use the public data server /pub/$USER, or private data storage if you have access to it.

Warning Do NOT use your home-directory for any serious work!
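
To see how much space you are using there, a quick check with standard tools (how quotas are reported depends on the server setup, so treat this as a sketch):

$ du -sh /data/users/$USER    # total space used in your home directory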

Public Data Server ( /pub ):

The public data server has a larger disk quota and is meant for moderate to heavy data I/O work. If you have access to private disk space, please use it instead, as the private servers are faster.

Your public data location is:

  • /pub/$USER

The public data server uses the XFS file-system mounted over NFS on the Infiniband network.
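
As a quick sanity check from any node, you can confirm that your public area is an NFS mount and see how much space is left:

$ df -hT /pub/$USER    # the Type column should read nfs/nfs4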

Public Fast Scratch ( /fast-scratch ):

When you need very fast temporary disk I/O access, you will want to use the HPC /fast-scratch file system. All users can access the /fast-scratch file-system from any node on the cluster at:

  • /fast-scratch/$USER

The HPC /fast-scratch server is a specialized (and expensive) node that uses 10,000 RPM 2.5" SATA drives, a fast LSI Nytro controller, and the BeeGFS high-performance parallel file system over the HPC Mellanox Infiniband network for very fast disk access.

The /fast-scratch file system is meant for when you need really fast disk I/O access, and for a temporary time only (the length of your job). It is NOT to be used for extended storage. Any file older than 2 weeks will be automatically removed.

Note Use /fast-scratch for the duration of your job ONLY.
Warning Files older than 2 weeks in /fast-scratch are automatically removed on a daily basis.
Warning If /fast-scratch reaches 95% capacity or higher, the automated process will start removing files younger than 2 weeks old. This has to be done to keep the filesystem from reaching 100% full.

Please remember to move or remove your data from /fast-scratch once your job is done.
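
A minimal sketch of the intended workflow, assuming a scheduler-provided job-ID variable ($JOB_ID here; substitute your scheduler's equivalent) and a hypothetical results directory:

$ mkdir -p /fast-scratch/$USER/$JOB_ID                 # private work area for this job
# ... run your job with its heavy I/O pointed at this directory ...
$ mv /fast-scratch/$USER/$JOB_ID/results /pub/$USER/   # stage results out
$ rm -rf /fast-scratch/$USER/$JOB_ID                   # clean up

To see which of your files are approaching the 2-week limit:

$ find /fast-scratch/$USER -mtime +14    # files older than 14 days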

Public Solid State Scratch ( /ssd-scratch ):

When you need very fast temporary disk I/O access on Solid State drives, you will want to use the HPC /ssd-scratch file system. All users can access the /ssd-scratch file-system from any node on the cluster at:

  • /ssd-scratch/$USER

The HPC /ssd-scratch server is a specialized (and expensive) node that uses 11 solid-state disks (11x 500MB/s in RAID 0) over the HPC Infiniband network.

Warning The same rules apply to /ssd-scratch as to /fast-scratch, listed above.

Public Local Scratch ( /scratch ):

All compute nodes on HPC have a local disk with a /scratch partition that is accessible by all users.

For temporary files, or if you need to write a lot of data quickly, it is recommended that you write to /scratch and, once your program is done but before your job ends, move the data from the node's /scratch back to your public or private work location (a sketch of this pattern follows the comparison list below).

The /scratch area is local to each node and should not be confused with the shared network filesystems. The advantage of /scratch is that, since it is local to the node, you can do heavy disk I/O to it without affecting the mounted filesystems.

The disadvantage of /scratch is that, since the data is local to the node, the data may be lost if the node crashes.

It should be noted that there are significant differences between /fast-scratch, /ssd-scratch, and /scratch:

  • /fast-scratch is accessible from all nodes on the cluster and it can be faster than /scratch.

  • /ssd-scratch is also accessible from all nodes on the cluster and in some cases may be the fastest filesystem on the cluster.

  • /scratch is local space on a node and the data is ONLY accessible from that node.
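
A minimal job-script sketch of the /scratch pattern described above, using hypothetical program and file names:

#!/bin/bash
# stage input into the node-local scratch area
mkdir -p /scratch/$USER/work
cp /pub/$USER/input.dat /scratch/$USER/work/
cd /scratch/$USER/work
# run the program with its heavy I/O against the local disk
/pub/$USER/my_program input.dat > output.dat
# copy results back BEFORE the job ends, then clean up
cp output.dat /pub/$USER/
cd /
rm -rf /scratch/$USER/work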

Private Data Storage:

The HPC private disk storage filesystems are significantly more powerful than public storage for several reasons:

  • They use the BeeGFS high-performance parallel file system, which is significantly faster than NFS.

  • Several RAID servers make up the private file system instead of just one or a few. With several RAID servers working in unison and in parallel, overall I/O throughput is that much higher.

  • They use the Mellanox Infiniband network on HPC for data I/O, which means data can theoretically move at 40Gb/s (5GB/s) instead of 1Gb/s via a node's normal GigE interface.

Private disk spaces are file systems/locations that individuals and/or schools at UCI have purchased for their groups' needs.

What Disk Space Do I Have Access To?

All HPC users have access to all public file systems.

For private file system access, the way to find out which filesystem locations you are authorized to use is to run the q command on HPC. For example:

$ q

In the output towards the top you will see a line that reads:

HPC Groups: [  ]

Inside the brackets [ ] you will see one or more HPC groups you belong to.

Using the HPC Data Access Tables below, you can find which public and private filesystems and locations you have access to.

For example, if the output from q shows that you belong to the HPC bio group, you have access to /bio/$USER. If the output shows som (School of Medicine), you have access to /som/$USER, and so on.

Depending on what HPC groups you belong to, you may have access to one or more filesystem locations.
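
For example, a user in the bio group might see output like this (illustrative):

$ q
...
HPC Groups: [ bio ]
...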

HPC DATA Access Table & Quota:

Public/Private   HPC Group   Access Location        Shortcut      Quota                    FS Type
--------------   ---------   --------------------   -----------   ----------------------   -------
Public           All Users   /data/users/$USER      -             50GB per user            XFS
Public           All Users   /pub/$USER             -             2TB per user             XFS
Public           All Users   /fast-scratch/$USER    -             2 Weeks Max.             BeeGFS
Public           All Users   /ssd-scratch/$USER     -             2 Weeks Max.             BeeGFS
Public           All Users   /scratch               -             Gone when node reboots   XFS
Private          bio         /dfs1/bio/$USER        /bio/$USER    No limit                 BeeGFS
Private          som         /dfs1/som/$USER        /som/$USER    No limit                 BeeGFS
Private          cbcl        /dfs1/cbcl/$USER       /cbcl/$USER   Group Quota              BeeGFS
Private          edu         /dfs1/edu/$USER        -             Group Quota              BeeGFS
Private          elread      /dfs1/elread/$USER     -             Group Quota              BeeGFS
Private          dabdub      /dfs1/dabdub/$USER     -             Group Quota              BeeGFS
Private          tw          /dfs1/tw/$USER         /tw/$USER     Group Quota              BeeGFS
Private          drg         /dfs1/drg/$USER        -             Group Quota              BeeGFS
Private          samlab      /share/samdata/$USER   -             100TB (size of unit)     ZFS

For users who have purchased private disk space on HPC, the limits for filesystems marked "Group Quota" above are set per group; contact HPC support for your group's current limit.

Private Compute-Node Filesystems:

These are file-systems that physically reside on compute nodes but are available cluster-wide. All compute node file systems are XFS and use the mdadm Linux software RAID setup.

These XFS filesystems are NFS-mounted via the automounter and are available cluster-wide from any HPC login or compute node at the /share/xxx/$USER access locations listed below.
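
The automounter mounts these shares on first access, so simply referencing the path from any node is enough. For example, using the cbcl1 share from the table below:

$ ls /share/cbcl1/$USER        # first access triggers the automount
$ df -hT /share/cbcl1/$USER    # confirm it is mounted over NFS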

Public/Private   HPC Group       Access FS Location             Compute Node      Quota
--------------   -------------   ----------------------------   ---------------   --------
Private          braincircuits   /share/braincircuits/$USER     compute-7-11      No limit
Private          cbcl            /share/cbcl1/$USER             compute-7-15      No limit
Private          edu             /share/edu/$USER               compute-4-11      No limit
Private          jje             /share/jje/$USER               compute-2-13      No limit
Private          kirkby          /share/dm/$USER                compute-6-1       No limit
Private          plasma          /share/dragon/$USER            compute-9-1       No limit
Private          samlab          /share/samlab/$USER            compute-2-2       No limit
Private          sam             /share/sam/$USER               compute-2-17      No limit
Private          som             /share/compute-3-[1-9]/$USER   compute-3-[1-9]   No limit
Private          krt             /share/kevin/$USER             compute-2-1       No limit
Private          rupert          /share/tjr/$USER               compute-4-7       No limit
Private          vvenugop        /share/vpc/$USER               compute-4-6       No limit
Private          chad            /share/chad/$USER              compute-3-10      No limit
Private          valdevit        /share/amd/$USER               compute-4-13      No limit
Private          bgaut           /share/bsg/$USER               compute-2-8       No limit
Private          seal            /share/seal/$USER              compute-4-26      No limit

File-Systems Outside of HPC:

These are data servers that physically reside OUTSIDE of HPC; as such, disk I/O will always be much slower than to internal HPC data servers, since the traffic usually has to cross several switches and buildings. These filesystems are NFS-mounted via the automounter.

Public/Private   HPC Group   Access FS Location          Quota
--------------   ---------   -------------------------   --------
Private          tw          /share/shwstore/$USER       No limit
Private          tw          /share/djtstore/$USER       No limit
Private          tw          /share/djtstore2/$USER      No limit
Private          tw          /share/djtstore3/$USER      No limit
Private          tw          /share/applications/$USER   No limit
Private          tw          /share/projects/$USER       No limit
Private          tw          /share/projects1/$USER      No limit
Private          tw          /share/projects2/$USER      No limit

Transferring Data To & From HPC:
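
Standard tools such as scp and rsync work for moving data in and out of the cluster. A minimal sketch, run from your own machine, with a hypothetical login-node hostname (hpc.example.edu) and hypothetical file names:

$ scp mydata.tar.gz myname@hpc.example.edu:/pub/myname/               # push a file to your public area
$ rsync -av myname@hpc.example.edu:/pub/myname/results/ ./results/    # pull results back (resumable)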

Backups:

HPC does NOT back up user data. As noted at the top of this page, it is your responsibility to keep copies of valuable data somewhere other than HPC.

Can I Purchase Private Disk Space?

Yes. You should try the public areas first to see if they meet your needs. If they do not, please email hpc-support at:

for pricing information, and give us an idea of how much disk space you are looking for.