HPC GPU Node:


NVIDIA Corporation has graciously donated four (4) of its high-end Tesla M2090 GPU cards to the HPC Cluster at UCI for your research needs.

Each NVIDIA Tesla M2090 card has the following attributes:
Peak double-precision floating-point performance: 665 Gigaflops
Peak single-precision floating-point performance: 1331 Gigaflops
Memory bandwidth (ECC off): 177 GBytes/sec
Memory size (GDDR5): 6 GigaBytes
CUDA cores: 512


The GPU node (compute-1-14) has dual Intel Xeon DP E5645 2.4 GHz CPUs (12 MB cache each, 24 cores in total) with 96 GB of DDR3 1333 MHz main memory.

The 4 Tesla M2090 cards provide a total of 2,048 CUDA cores (4 x 512).

When requesting GPU resources, please try to request 6 Intel cores for each GPU card you request. The node has 24 Intel cores and 4 GPU cards, which works out to 6 Intel cores per GPU card.

There are no fixed numbers for cores versus GPU cards; it all depends on the program you are running. If you can run with 2 Intel cores and 2 GPU cards, then use those numbers.
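
The 6-cores-per-card guideline can be written as a small helper. This is only a sketch: the function name cores_for_gpus is made up for illustration, and the 24-core and 4-card figures are the ones quoted above for compute-1-14.

```shell
# Hypothetical helper: suggest how many Intel cores to request for N GPU cards,
# using the guideline of 24 node cores / 4 cards = 6 cores per card.
NODE_CORES=24
NODE_GPUS=4

cores_for_gpus() {
    gpus=$1
    # Reject anything that is not a plain non-negative integer.
    case "$gpus" in
        ''|*[!0-9]*) echo "gpu count must be a number" >&2; return 1 ;;
    esac
    # Reject requests outside 1..NODE_GPUS.
    if [ "$gpus" -lt 1 ] || [ "$gpus" -gt "$NODE_GPUS" ]; then
        echo "gpu count must be between 1 and $NODE_GPUS" >&2
        return 1
    fi
    echo $(( gpus * NODE_CORES / NODE_GPUS ))
}

cores_for_gpus 2
```

Treat the result as a starting point only; if your program needs fewer cores per card, request fewer.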


A sample CUDA job script is available at:
~demo/hello-cuda.sh

    $ cat  ~demo/hello-cuda.sh

The script sets the following scheduler directives:

 #$ -q gpu
 Requests the GPU queue.
 #$ -l gpu=1
 Requests 1 GPU card out of the 4 available GPU cards.
 #$ -pe gpu-node-cores 6
 Runs with the Parallel Environment "gpu-node-cores", requesting 6 node cores.
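
If you need a different number of cards or cores, the three directives can be generated instead of edited by hand. The sketch below is an illustration under assumptions: the function name make_gpu_job is hypothetical, the body of the real ~demo/hello-cuda.sh is not reproduced here, and whether your job needs the module load line depends on your program.

```shell
# Hypothetical generator for a GPU job script.
# Usage: make_gpu_job NUM_GPUS NUM_CORES COMMAND [ARGS...]
make_gpu_job() {
    gpus=$1
    cores=$2
    shift 2
    printf '#!/bin/bash\n'
    printf '#$ -q gpu\n'                          # GPU queue
    printf '#$ -l gpu=%s\n' "$gpus"               # number of GPU cards
    printf '#$ -pe gpu-node-cores %s\n' "$cores"  # number of node cores
    # Assumption: the job wants the CUDA module named on this page.
    printf 'module load nvidia-cuda/5.0\n'
    # The command to run (placeholder supplied by the caller).
    printf '%s\n' "$*"
}

# Example: 2 GPU cards, 12 cores, running a placeholder program.
make_gpu_job 2 12 ./my_cuda_program
```

Redirect the output to a file (for example my-job.sh) and submit that file with qsub. The program name ./my_cuda_program is a placeholder.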

Let's run a CUDA hello world example:

$ mkdir cuda
$ cd cuda
$ cp ~demo/hello-cuda.sh  .
$ qsub hello-cuda.sh
$ qstat


Check the directory for the output "out" file and other files the script created.
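
Instead of re-running qstat by hand, a script can poll until the job has left the queue. This is a rough sketch, not a tested recipe for this cluster: the function name wait_for_job is hypothetical, and it assumes that "qstat -j JOBID" exits with a non-zero status once the job no longer exists, which should be checked against the Grid Engine version installed here.

```shell
# Hypothetical helper: block until the given job has left the queue.
# Assumption: `qstat -j JOBID` returns non-zero when the job no longer exists.
wait_for_job() {
    job_id=$1
    interval=${2:-30}   # seconds between polls (default 30)
    while qstat -j "$job_id" >/dev/null 2>&1; do
        sleep "$interval"
    done
}
```

A possible use, assuming your qsub supports the -terse option (which prints only the job ID): run job_id=$(qsub -terse hello-cuda.sh), then wait_for_job "$job_id", then list the directory to see the output files.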



How many GPUs are available now?

As mentioned above, the GPU node compute-1-14 has 4 GPU cards. To see how many GPUs are currently available, use:

$ qhost -F gpu -h compute-1-14
HOSTNAME           NCPU NSOC NCOR NTHR  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
--------------------------------------------------------------------------------
compute-1-14        24    2   12   24  0.69   94.6G    1.8G   94.4G     0.0
    Host Resource(s):      hc:gpu=4.000000

In this example, GPU compute node compute-1-14 has 4 GPUs available.
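
If you want the number of free cards in a script, the "hc:gpu=" line can be picked out of the qhost output. A minimal sketch follows; the function name free_gpus is made up, and it assumes the output keeps the format shown above.

```shell
# Hypothetical helper: read `qhost -F gpu` output on stdin and print the
# value of the hc:gpu resource as a whole number.
free_gpus() {
    awk -F'gpu=' '/hc:gpu=/ { printf "%d\n", $2 }'
}

# On the cluster you would run:
#   qhost -F gpu -h compute-1-14 | free_gpus
# Here the sample line from the output above is used so the sketch runs anywhere.
echo '    Host Resource(s):      hc:gpu=4.000000' | free_gpus
```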



CUDA-Compilers

The CUDA compiler, debugger, and libraries are available with:

    module load  nvidia-cuda/5.0
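
Once the module is loaded, a small program can be compiled with nvcc. The sketch below is not the contents of the demo script; the file name hello.cu and the kernel are made up for illustration. It targets compute capability 2.0 (sm_20), which matches the Fermi-based Tesla M2090 and is needed for printf inside a kernel.

```shell
# Write a tiny CUDA program to hello.cu (the file name is arbitrary).
cat > hello.cu <<'EOF'
#include <cstdio>

// Each GPU thread prints its own index.
__global__ void hello() {
    printf("Hello from GPU thread %d\n", threadIdx.x);
}

int main() {
    hello<<<1, 4>>>();          // launch 1 block of 4 threads
    cudaDeviceSynchronize();    // wait for the kernel and flush its output
    return 0;
}
EOF

# Compile only where nvcc is on the PATH (for example after the module load).
if command -v nvcc >/dev/null 2>&1; then
    nvcc -arch=sm_20 -o hello hello.cu
else
    echo "nvcc not found; load the CUDA module first" >&2
fi
```

Run the resulting hello binary from within a job on the gpu queue rather than on the login node, since only compute-1-14 has the GPU cards.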



CUDA Documentation:

On the HPC cluster, you can get additional help files at /data/apps/cuda/doc  or by clicking on this link.

The CUDA SDK Toolkit has been installed in /data/apps/cuda/NVIDIA_GPU_Computing_SDK

CUDA SDK Toolkit Documentation is also available from this link.




NVIDIA-SMI

To display the GPU information, you can use the qrsh command as follows:

$ qrsh -q gpu nvidia-smi 

Fri Apr 19 10:10:01 2012      
+------------------------------------------------------+                      
| NVIDIA-SMI 3.295.41   Driver Version: 295.41         |                      
|-------------------------------+----------------------+----------------------+
| Nb.  Name                     | Bus Id        Disp.  | Volatile ECC SB / DB |
| Fan   Temp   Power Usage /Cap | Memory Usage         | GPU Util. Compute M. |
|===============================+======================+======================|
| 0.  Tesla M2090               | 0000:04:00.0  Off    |         0          0 |
|  N/A    N/A  P0    77W / 225W |   6%  330MB / 5375MB |   31%     Default    |
|-------------------------------+----------------------+----------------------|
| 1.  Tesla M2090               | 0000:05:00.0  Off    |         0          0 |
|  N/A    N/A  P12   29W / 225W |   0%   10MB / 5375MB |    0%     Default    |
|-------------------------------+----------------------+----------------------|
| 2.  Tesla M2090               | 0000:83:00.0  Off    |         0          0 |
|  N/A    N/A  P12   27W / 225W |   0%   10MB / 5375MB |    0%     Default    |
|-------------------------------+----------------------+----------------------|
| 3.  Tesla M2090               | 0000:84:00.0  Off    |         0          0 |
|  N/A    N/A  P12   28W / 225W |   0%   10MB / 5375MB |    0%     Default    |
|-------------------------------+----------------------+----------------------|
| Compute processes:                                               GPU Memory |
|  GPU  PID     Process name                                       Usage      |
|=============================================================================|
|  0.  13951    ...namd/NAMD_2.9b3_Linux-x86_64-multicore-CUDA/namd2   317MB  |
+-----------------------------------------------------------------------------+

In the display above, Tesla #0 is active and has a load of 31%. All other Tesla cards are idle (0% utilization).
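
To pull the per-card utilization out of this display in a script, the two-line entry for each card can be parsed with awk. This is a fragile sketch tied to the exact table layout printed by the driver version shown above; newer drivers print a different layout. The function name gpu_util is hypothetical.

```shell
# Hypothetical helper: read the nvidia-smi table (layout as shown above) on
# stdin and print "<card number> <GPU utilization %>" for each Tesla card.
gpu_util() {
    awk -F'|' '
        # First line of an entry, e.g. "| 0.  Tesla M2090 ... |"
        /^\| *[0-9]+\. +Tesla/ { split($2, a, "."); id = a[1] + 0; have = 1; next }
        # Second line of the entry: the 4th field holds "GPU Util.  Compute M."
        have { util = $4; sub(/%.*/, "", util); print id, util + 0; have = 0 }
    '
}

# On the cluster you would run:
#   qrsh -q gpu nvidia-smi | gpu_util
# Sample entry copied from the display above:
printf '%s\n' \
  '| 0.  Tesla M2090               | 0000:04:00.0  Off    |         0          0 |' \
  '|  N/A    N/A  P0    77W / 225W |   6%  330MB / 5375MB |   31%     Default    |' \
  | gpu_util
```

A card reporting 0 in the second column is idle at the moment the command ran.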

You can get additional help for nvidia-smi on compute-1-14 with:



If you are familiar with using GPUs and would like to contribute to helping others learn how to use the GPU node, please let me know and I will post it on the HPC How To list.