FurtherTopics/Additional Information On Viper

From HPC
Revision as of 09:57, 8 November 2022 by Pysdlb (Accelerator Nodes)


What is Viper

Viper is the University of Hull's High Performance Computer (HPC).

Our supercomputer, more accurately called a High Performance Computer (shortened to HPC), is made up of about 200 separate computers linked together by a very high-speed network. The network allows them to exchange data and gives them the appearance of acting as one large computer. The whole of Viper runs the Linux operating system, which is used by the majority of research systems, including all of the top 500 supercomputers in the world.

Viper Infrastructure

Compute Nodes

These make up the majority of the nodes and perform most of the standard computing work within the HPC. Each node has 28 cores and 128 GB of RAM.
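
Work is sent to these nodes through the SLURM job scheduler (described under Controller Nodes below). As a sketch only, a minimal batch script for a standard compute node might look like this; the partition name `compute` and the time limit are assumptions, so check the site documentation for the real values:

```shell
#!/bin/bash
#SBATCH --job-name=example        # name shown in the queue listing
#SBATCH --partition=compute       # assumed partition name for standard nodes
#SBATCH --nodes=1                 # a single 128 GB compute node
#SBATCH --time=00:10:00           # wall-clock time limit

# The #SBATCH lines above are comments to the shell but directives to
# the scheduler; everything below runs on the allocated node.
RESULT="job ran on $(hostname)"
echo "$RESULT"
```

The same script runs unchanged on any Linux machine, which makes it easy to test locally before submitting.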

High Memory Nodes

These are very similar to the compute nodes but have much more memory: ours have 1 TB of RAM each, making them ideal for research involving large-memory models, such as engineering simulation and DNA analysis in biology.
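
A job targets these nodes by asking the scheduler for a large memory allocation. The sketch below is illustrative only: the partition name `highmem` is an assumption, and `--mem=900G` simply requests most of a 1 TB node. The payload just reports the memory visible to the job:

```shell
#!/bin/bash
#SBATCH --partition=highmem       # assumed name of the high-memory partition
#SBATCH --nodes=1
#SBATCH --mem=900G                # request most of the node's 1 TB of RAM

# Report the total memory visible on the node the job landed on.
MEM_KB=$(grep MemTotal /proc/meminfo | awk '{print $2}')
echo "Total memory: ${MEM_KB} kB"
```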

Accelerator Nodes

4 GPU nodes, each identical to a compute node with the addition of one Nvidia Ampere A40 GPU. These are similar to the high-end graphics cards found in gaming PCs. Their usefulness comes from the thousands of very small processing cores they contain, which makes them very effective at executing small pieces of code in a massively parallel way. This is also why these cards are used in the newer fields of machine learning and deep learning.
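
A job requests one of these cards through SLURM's generic-resource mechanism. In this sketch the partition name `gpu` is an assumption, while `--gres=gpu:1` is standard SLURM syntax for requesting one GPU; the payload is guarded so the script also runs on machines without a GPU:

```shell
#!/bin/bash
#SBATCH --partition=gpu           # assumed name of the GPU partition
#SBATCH --gres=gpu:1              # request one of the A40 cards

# Inside a real GPU job, nvidia-smi lists the allocated card.
# Guard the call so the sketch still runs where no GPU is present.
if command -v nvidia-smi >/dev/null 2>&1; then
    GPU_INFO=$(nvidia-smi --query-gpu=name --format=csv,noheader)
else
    GPU_INFO="no GPU visible on this machine"
fi
echo "$GPU_INFO"
```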

Visualisation Nodes

These are used for connections from remote computers such as desktops, allowing the rendered output from data on Viper to be viewed on a local machine. There are two visualisation nodes, each with 2x Nvidia GTX 980 Ti.

High Speed Network

All these compute nodes are connected by a very fast Intel Omni-Path network, running at 100 Gbit/s, which allows the nodes to act together.

Storage Nodes

These are servers in their own right which provide access to the actual storage arrays (hard disks); Viper accesses its disk storage via these nodes.

Storage Array

These are the actual disks, held in a chassis, which make up the whole file storage for Viper. Viper has a total file space of 0.5 petabytes (500 terabytes).

Controller Nodes

The controller nodes (sometimes called head nodes) are responsible for managing all of the compute nodes: they handle the loading, termination and completion of jobs via the job scheduler, which on Viper is SLURM.
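
Day-to-day interaction with the scheduler happens through a handful of standard SLURM commands. The sketch below only prints the typical sequence, since the real commands need a running scheduler; the job script name and job ID are illustrative:

```shell
#!/bin/bash
# Typical SLURM workflow. sbatch, squeue and scancel are standard
# SLURM commands; they are printed rather than executed here so the
# sketch runs on any machine.
WORKFLOW="sbatch myjob.sh    # submit the job; prints a job ID
squeue -u \$USER             # list your queued and running jobs
scancel 12345               # cancel job 12345"
echo "$WORKFLOW"
```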

Login Nodes

These are the nodes which allow users to log in to the cluster and prepare jobs for submission. Although a login node is a server in its own right, it is not used for compute work. There are also two Active Directory (AD) servers, which act as an interface between the University's login/password system and Viper.

How can Viper be used?

  • In parallel (single node): where a job can run on a single node with 28 cores.
  • In parallel (multiple nodes): where a job needs more than 28 cores.
  • Using a High Memory node: where a job needs a large amount of memory (currently these nodes have 1 TB each).
  • Using GPU node(s): where a job can use fast GPU-accelerated calculations (for example, high-end graphics manipulation).
  • Using the Visualisation nodes: for interactive visualisation and viewing 3D models.
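
As an illustration of the multi-node case above, a job needing more than 28 cores spans several nodes and is usually an MPI program launched with `srun`. This is a sketch only: the program name `./my_mpi_app` is a placeholder, and the real launch line is shown as a comment because it only works inside an allocation:

```shell
#!/bin/bash
#SBATCH --nodes=2                 # two compute nodes
#SBATCH --ntasks-per-node=28      # use all 28 cores on each node

# Inside a real allocation, srun launches the MPI ranks:
#   srun ./my_mpi_app
# Here we just compute the total rank count so the sketch runs anywhere.
NODES=2
TASKS_PER_NODE=28
TOTAL_TASKS=$((NODES * TASKS_PER_NODE))
echo "Would launch ${TOTAL_TASKS} MPI ranks"
```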