HPC and GPU computing at RGNC

NVIDIA CUDA Research Center

The University of Seville, through the Research Group on Natural Computing, is an NVIDIA GPU Research Center for 2016, as it was a CUDA Research Center during 2014 and 2015. NVIDIA donated a Tesla K40 to our group in 2014. The PIs of the CRC for 2016 are Agustín Riscos-Núñez and Mario J. Pérez-Jiménez.

Transcript of the news from 2014, when NVIDIA first awarded the CRC to the University of Seville:

The American company NVIDIA has awarded a project presented by Miguel A. Martínez del Amor and Mario J. Pérez-Jiménez, members of the Research Group on Natural Computing of the University of Seville, given the "vision, quality, and impact of the research leveraging CUDA technology".

As a result, the University of Seville has been selected as a CUDA Research Center for the year 2014. Thus, until January 29, 2015, the University of Seville will enjoy a series of benefits, including a latest-generation Tesla K40 graphics card for the Research Group on Natural Computing, discounts on NVIDIA equipment under the Centers Reward Program, online training sessions tailored for the University of Seville, and support from NVIDIA technical personnel.

The head of the research group, Mario J. Pérez-Jiménez, considers that this award will give a decisive impulse to the acceleration of the simulators under development in the framework of unconventional cellular machines.




General Information

The Research Group on Natural Computing hosts High Performance Computing (HPC) and GPU computing servers in its facilities. They were used to develop and run the first GPU-accelerated simulators for P systems, which can be downloaded from the PMCGPU software project. Today, they are used for GPU computing applied to Membrane Computing simulators and beyond, using CUDA and OpenCL.

The servers are connected to a UPS power protection system and installed in a cooled room. Access to these servers is restricted to RGNC members and collaborators. If you want to collaborate with us, contact our group. We are open to new ideas and projects!

The administrators are Luis Felipe Macías-Ramos, Daniel Hugo Campora and Miguel A. Martínez-del-Amor, so if you have any questions, new software to install, problems, etc., please email any of them.




Features

We currently hold two GPU computing servers, named after the two highest peaks in Spain.

Teide:

  • Supermicro Server 7046GT-TRF
  • CPU: 8 cores (2 sockets x 4 cores) Intel Xeon E5504 (Nehalem) @ 2.00GHz
  • Main memory: 12 GBytes DDR3 @ 1333MHz
  • GPUs:
    • 3 x NVIDIA Tesla C1060 - 240 cores (30 SMs x 8 SPs) @ 1.30GHz, 4 GBytes GDDR3 (provided by our R&D projects)
    • 1 x NVIDIA GeForce GTX 550 Ti - 192 cores (4 SMs x 48 SPs) @ 1.90GHz, 1 GByte GDDR5 (provided by Manuel García).
    • CUDA 6.5
  • Operating system: Linux Ubuntu Server 14.04.4 LTS 64 bits
  • Programming languages:
    • C/C++: gcc/g++ 4.8.4, 4.6.3 and 4.4.7. Libraries: GSL, boost
    • Python: 2.7.6 and 3.4. Libraries: pycuda, pyopencl, scientific.
    • Java: Oracle SE 1.8.0, and OpenJDK 1.6.0.
    • R: 3.0.2. RStudio.
    • Fortran: gfortran 4.8.4 and 4.6.3.
  • Queue system: SLURM (Simple Linux Utility for Resource Management) 2.6.5

Mulhacen:

  • Xeon-based server
  • CPU: 4 cores (with Hyper-Threading, up to 8 "virtual" cores) Intel Xeon E3-1230 v3 (Haswell) @ 3.30GHz
  • Main memory: 32 GBytes DDR3 @ 2400MHz
  • GPUs:
    • 1 x NVIDIA Tesla K40c - 2880 cores (15 SMXs x 192 SPs) @ 0.88GHz, 12 GBytes GDDR5 (provided by NVIDIA under the CUDA Research Center program)
    • 1 x NVIDIA GeForce GTX 780 Ti - 2880 cores (15 SMXs x 192 SPs) @ 0.93GHz, 3 GBytes GDDR5 (provided by our R&D projects).
    • CUDA 7.5
  • Operating system: Linux CentOS 7.5 64 bits
  • Programming languages:
    • C/C++: gcc/g++ 5.3.0 (default), 4.8.5. Libraries: GSL, boost
    • Python: 2.7.6 and 3.4.0.
    • Java: OpenJDK 1.8.0.
    • Fortran: gfortran 5.3.0 (default), 4.8.5.
  • Queue system: SLURM (Simple Linux Utility for Resource Management) 15.08




Funding and Acknowledgments

The Mulhacen GPU server at RGNC was provided by the R&D project TIN2012-37434 (funded by Ministerio de Economía y Competitividad of Gobierno de España), co-financed by the European FEDER funds. The Tesla K40c GPU was a donation by NVIDIA under the CUDA Research Center program.

The Teide GPU server at RGNC was provided by the R&D projects P08-TIC4200 (funded by Consejería de Economía, Innovación y Ciencia of Junta de Andalucía) and TIN2009-13192 (funded by Ministerio de Ciencia e Innovación of Gobierno de España), both co-financed by the European FEDER funds.

Please consider acknowledging our R&D projects in your papers if you use our GPU servers for your research (and please notify us when you do).




Quick usage manual

Teide Server:

CUDA 6.5 is installed on the server. If you want to have the sample codes in your home directory, just run the following command in the folder where you want them (it will create a new folder): cuda-install-samples-6.5.sh .
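For instance, a minimal session that installs the samples and runs the bundled deviceQuery utility could look like the sketch below (the samples folder name follows CUDA 6.5's default layout; adjust it if your copy differs):

  cuda-install-samples-6.5.sh .                        # copy the samples into the current folder
  cd NVIDIA_CUDA-6.5_Samples/1_Utilities/deviceQuery
  make                                                 # build using the installed CUDA 6.5 toolchain
  ./deviceQuery                                        # list the GPUs visible in your session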

Our GPU server is devoted to computing, so it is important to keep all the available resources under control. We have moved from Grid Engine to SLURM because of its simple way of managing resources, which fits our purposes well.

There are four GPUs, divided into two groups:

  • Group 1: When a user logs in via SSH, they can see and use only two GPUs (device 0 is the Tesla C1060 and device 1 is the GeForce GTX 550 Ti). These GPUs are in exclusive mode, which means that only one user can use each of them at a time. Use them for testing/debugging your code.
  • Group 2: These GPUs are out of reach for users in a normal session; however, they can be accessed through SLURM commands. They are managed by a queue system that controls which jobs are executed and lets users submit jobs and leave without waiting for their termination (it is possible to give an email address for alerts). Use them for production runs.

Therefore, the GPUs of Group 1 can be used directly, by executing CUDA/OpenCL programs from the terminal or from scripts (e.g. "./deviceQuery"). The GPUs of Group 2, in contrast, are under the control of SLURM, but they are also easy to use. Please consider the following commands:

  • sbatch: submit a script or binary to the devices queue. These jobs will be executed as soon as there are available resources (see the sample script after this list). Options:
    • --gres=gpu:X : ask SLURM to use X GPUs (X can be 1 or 2, e.g. --gres=gpu:1 or --gres=gpu:2). It is mandatory if your program will use CUDA devices.
    • -c : number of CPUs to be used by your job.
    • --mail-type=type : notify on state change: BEGIN, END, FAIL or ALL. Requires the --mail-user option.
    • --mail-user=user : whom to send email notifications about job state changes. Requires the --mail-type option.
    • Examples:
      • Execute a script to run my program, making use of one GPU: sbatch --gres=gpu:1 ./myscript
      • Execute a script to run my program, making use of two GPUs with two CPU threads, and providing my email to know when it finishes: sbatch --gres=gpu:2 -c 2 --mail-type=ALL --mail-user=mdelamor@cs.us.es ./myscript
  • srun: submit a job to the devices queue, but execute it interactively. That is, this command does the same as executing your binary in the terminal (it prints the output on screen and reads from the keyboard), but it runs through SLURM (it will wait until resources are available, etc.). This is the best choice if you want to test your program. The options are the same as for sbatch. For example: srun --gres=gpu:1 ./myprogram -i myinputfile.txt -o myoutputfile.txt -s 100 -t 143
  • squeue: print the queue state: how many jobs are being executed and how many are waiting.
  • scancel: cancel a submitted job by providing its ID.
  • scontrol show nodes: show information about our compute node.
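Putting these options together, a minimal job script might look like the following sketch (myprogram, its flags and the email address are placeholders; adapt them to your own job):

  #!/bin/bash
  #SBATCH --gres=gpu:1                  # request one GPU (mandatory for CUDA jobs)
  #SBATCH -c 2                          # request two CPU cores
  #SBATCH --mail-type=END               # send an email when the job finishes
  #SBATCH --mail-user=you@cs.us.es      # placeholder address: use your own

  # your binary and its options (placeholders)
  ./myprogram -i myinputfile.txt -o myoutputfile.txt

Submit it with sbatch ./myscript, check its state with squeue, and cancel it with scancel providing the job ID.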

IMPORTANT: Keep your code in "standard" CUDA (please delete any code previously added for Grid Engine); that is, your CUDA code can assume that there are only two devices, with IDs 0 and 1 (but in reality they are devices 2 and 3: SLURM magic!). The system allows concurrent jobs to run if each needs only 1 GPU, but then each job will see only one GPU, with ID 0. Thus, keep your code generic and use only the system's default device; SLURM will take care of giving you an idle one.
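Behind the scenes, SLURM normally performs this remapping by setting the CUDA_VISIBLE_DEVICES environment variable for your job, and the CUDA runtime then renumbers the granted devices starting from 0. If you want to check which physical GPU your job was actually given, a one-liner like the following should work (illustrative sketch):

  # the echo runs inside the allocation, so it sees SLURM's environment
  srun --gres=gpu:1 bash -c 'echo "CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"'

This may print, e.g., CUDA_VISIBLE_DEVICES=2, while your CUDA code sees that same GPU as device 0.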

Task affinity is also configured, so you can bind your threads to the specific cores you want to use; see the example below, and read the SLURM manuals for details.
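For instance, srun accepts a CPU binding option (spelled --cpu_bind in this SLURM version; check the srun man page on the server, since the spelling and defaults vary across versions). A hypothetical example:

  # bind the job to cores 0 and 1 (mask 0x3) while using one GPU
  srun --gres=gpu:1 -c 2 --cpu_bind=mask_cpu:0x3 ./myprogram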

Mulhacen Server:

CUDA 7.5 is installed on this server. If you want to have the sample codes in your home directory, just run the following command in the folder where you want them (it will create a new folder): cuda-install-samples-7.5.sh .

As in the case of the Teide server, the Mulhacen GPU server is managed by SLURM.

There are three GPUs, divided into two groups:

  • Group 1: When a user logs in via SSH, they can see and use only two GPUs (device 1 is the GTX 780 Ti and device 2 is the GeForce 9400 GT). These GPUs are in exclusive mode, which means that only one user can use each of them at a time. Use them for testing/debugging your code.
  • Group 2: Our Tesla K40 is out of reach for users in a normal session; however, it can be accessed through SLURM commands. Use it for production runs.

Therefore, the GPUs of Group 1 can be used directly, by executing CUDA/OpenCL programs from the terminal or from scripts (e.g. "./deviceQuery"). The GPU of Group 2, in contrast, is under the control of SLURM. Use the following commands to access the Tesla K40 (see the examples after this list):

  • sbatch or srun: submit a script/binary to the devices queue. Options:
    • --gres=gpu:1 : ask SLURM to use the K40 GPU. It is mandatory if your program will use CUDA devices; otherwise, no device will be given.
    • -c : number of CPUs to be used by your job.
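For example (myscript and myprogram are placeholders for your own files):

  sbatch --gres=gpu:1 -c 4 ./myscript      # queued: runs on the Tesla K40 when it is free
  srun --gres=gpu:1 ./myprogram            # interactive: waits for the K40 and prints output on screen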


Access to our servers

Our servers are accessible to RGNC members and all our collaborators. Just get in touch with us if you want to become a collaborator.

Download the following guide to learn how to configure your machine for CUDA development using our servers: