==What is Viper==

Viper is the University of Hull's High Performance Computer (HPC).

A supercomputer, or more accurately a High Performance Computer, is made up of about 200 separate computers linked together by a very high speed network. This network allows the machines to exchange data and gives the appearance that they are acting as one large computer. The whole of Viper runs the Linux operating system, which is used by the majority of research systems, including all of the top 500 supercomputers in the world.

=== Physical Hardware ===

Viper is composed of approximately 5,500 processing cores with the following specialised areas:

* 180 compute nodes, each with 2x 14-core Intel Broadwell E5-2680v4 processors (2.4–3.3 GHz) and 128 GB DDR4 RAM
* 4 high memory nodes, each with 4x 10-core Intel Haswell E5-4620v3 processors (2.0 GHz) and 1 TB DDR4 RAM
* 4 GPU nodes, each identical to a compute node with the addition of an Nvidia Ampere A40 GPU
* 2 visualisation nodes with 2x Nvidia GTX 980 Ti
* Intel [http://www.intel.com/content/www/us/en/high-performance-computing-fabrics/omni-path-architecture-fabric-overview.html Omni-Path] interconnect (100 Gb/s node-to-switch and switch-to-switch)
* 500 TB parallel file system ([http://www.beegfs.com/ BeeGFS])

The compute nodes ('''compute''' and '''highmem''') are '''stateless''', while the GPU and visualisation nodes are stateful.

* Note: a stateless node has no persistent operating system storage between boots (statefulness normally comes from the operating system living on a local hard drive). Stateless does not necessarily mean diskless: our compute nodes also have 128 GB of SSD temporary space.
  
==Infrastructure==

* 4 racks with dedicated cooling and hot-aisle containment (see diagram below)
* An additional rack for storage and management components
* A dedicated, high-efficiency chiller on the AS3 roof for cooling
* UPS and generator power failover

[[File:Rack-diagram.jpg]]

The node types are summarised in the table below and described in more detail in the following sections.

{| class="wikitable"
! style="width:20%" | Image
! style="width:5%" | Quantity
! style="width:75%" | Description
|-
| [[File:Node-compute.png]]
| '''180'''
| The main processing cluster nodes; each has 28 Intel Broadwell E5-2680v4 cores and 128 GB of RAM
|-
| [[File:Node-himem.jpg]]
| '''4'''
| Specialised high memory nodes; each has 40 Intel Haswell E5-4620v3 cores and 1 TB of RAM
|-
| [[File:Node-visualisation.jpg]]
| '''4'''
| Specialised accelerator nodes, each with one Nvidia A40 GPU and 128 GB of RAM
|-
| [[File:Node-visualisation.jpg]]
| '''1'''
| A specialised accelerator node with 2x Nvidia P100 GPUs and 128 GB of RAM
|-
| [[File:Node-visualisation.jpg]]
| '''2'''
| Visualisation nodes that allow remote graphical viewing of data using Nvidia GeForce GTX 980 Ti cards
|}

===Compute Nodes===
 
These make up most of the nodes in the cluster and perform the bulk of the standard computational work on the HPC. Each node has 28 cores and 128 GB of RAM.
 
  
===High Memory Nodes===

These are very similar to the compute nodes but have much more memory: 1 TB of RAM each. That makes them ideal for research involving large memory models, such as engineering simulations or DNA analysis in biology.
  
===Accelerator Nodes===

There are 4 GPU nodes, each identical to a compute node with the addition of an Nvidia Ampere A40 GPU. These are similar to the high-end graphics cards found in gaming machines. Their usefulness comes from the thousands of small processing cores they contain, which makes them very effective at running small pieces of code in a massively parallel way. This is also why such cards are widely used for machine learning and deep learning.
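
GPU work is submitted through the SLURM scheduler in the same way as other jobs (see Controller Nodes below). The following is a minimal sketch only; the partition name <code>gpu</code> and the program name are assumptions rather than Viper's actual configuration.

<pre>
#!/bin/bash
#SBATCH --job-name=gpu-example
#SBATCH --partition=gpu          # assumed name of the GPU partition
#SBATCH --gres=gpu:1             # request one GPU (an A40 on these nodes)
#SBATCH --cpus-per-task=4
#SBATCH --mem=32G
#SBATCH --time=01:00:00

# Show which GPU was allocated, then run the GPU-enabled application
nvidia-smi
./my_gpu_program                 # placeholder for your own program
</pre>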
 
  
===Visualisation Nodes===

These are used for connecting from remote computers such as desktops, allowing the rendered output of data to be viewed on a local machine. There are two visualisation nodes with 2x Nvidia GTX 980 Ti cards.
  
===High Speed Network===
 
All of these nodes are connected by a very fast Intel Omni-Path network running at 100 Gbit/s, which allows the compute nodes to act together.
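
Jobs that span more than one node normally communicate over this network using MPI. The script below is a rough sketch of such a job; the partition name <code>compute</code> and the module name <code>openmpi</code> are assumptions and may differ on Viper.

<pre>
#!/bin/bash
#SBATCH --job-name=mpi-example
#SBATCH --partition=compute      # assumed name of the standard compute partition
#SBATCH --nodes=4                # 4 nodes x 28 cores = 112 MPI tasks
#SBATCH --ntasks-per-node=28
#SBATCH --time=02:00:00

# Load an MPI implementation (module name is an assumption)
module load openmpi

# srun starts one MPI rank per task across all allocated nodes
srun ./my_mpi_program            # placeholder for your own MPI program
</pre>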
 
  
===Storage Nodes===
 
These are servers in their own right which provide access to the actual storage arrays (the disks); Viper accesses its disk storage via these nodes.
 
  
===Storage Array===
 
These are the actual disks, held in a chassis, which make up the whole file store for Viper. Viper has a total file space of 0.5 petabytes (500 TB).
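
From a user's point of view this storage simply appears as the shared file systems mounted on every node. A quick way to see the size and free space is shown below; the path <code>/home</code> is an assumption about where the BeeGFS file system is mounted on Viper.

<pre>
# Report the size, used and available space of the file system behind the given path
df -h /home
</pre>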
 
  
===Controller Nodes===

The controller nodes (sometimes called head nodes) are responsible for managing all of the compute nodes: they handle the loading, termination and completion of jobs via the job scheduler, which on Viper is SLURM.
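
As a minimal sketch of how this looks to a user, a job is described in a batch script and handed to SLURM with <code>sbatch</code>. The partition name <code>compute</code> below is an assumption rather than Viper's actual configuration.

<pre>
#!/bin/bash
#SBATCH --job-name=example
#SBATCH --partition=compute      # assumed partition name
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=28     # use all 28 cores of one compute node
#SBATCH --time=00:30:00

# These commands run on the allocated compute node, not on the login node
echo "Running on $(hostname)"
./my_program                     # placeholder for your own program
</pre>

Typical scheduler commands are <code>sbatch job.sh</code> to submit the job, <code>squeue -u $USER</code> to check its place in the queue, and <code>scancel JOBID</code> to cancel it.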
 
 
 
===Login Nodes===
 
These are the nodes which allow users to log in to the cluster and prepare their jobs. Although a login node is a server in its own right, it is not used for compute work. There are also two Active Directory (AD) servers which act as an interface between the University's login/password system and Viper.
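
Access is normally over SSH; the hostname below is a placeholder rather than Viper's real address, so use the one given in the access documentation.

<pre>
# Log in with your University username (hostname is a placeholder)
ssh username@viper.example.ac.uk

# Prepare and submit jobs from the login node; do not run heavy computation here
sbatch job.sh
</pre>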
 
 
 
==How can Viper be used?==
 
*In '''Parallel (Single Node)''': where a job can run on a single node using up to 28 cores.
*In '''Parallel (Multiple Nodes)''': where a job needs more than 28 cores and must therefore span several nodes.
*Using a '''High Memory''' node: where a job needs a very large amount of memory (currently these nodes have 1 TB each).
*Using '''GPU''' node(s): where a job can benefit from fast GPU-accelerated calculations (for example, high-end graphics manipulation or machine learning).
*Using the '''Visualisation''' nodes: for interactive visualisation and viewing 3D models.

The scheduler directives that distinguish these cases are sketched below.
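
The main practical difference between these ways of using Viper is the resource request at the top of the batch script. The directives below are illustrative only; the partition names <code>compute</code>, <code>highmem</code> and <code>gpu</code> are assumptions and should be checked against Viper's actual SLURM configuration.

<pre>
# Parallel (single node): all 28 cores on one compute node
#SBATCH --partition=compute --nodes=1 --ntasks-per-node=28

# Parallel (multiple nodes): e.g. 4 nodes x 28 cores = 112 tasks
#SBATCH --partition=compute --nodes=4 --ntasks-per-node=28

# High memory: one node with up to 1 TB of RAM
#SBATCH --partition=highmem --nodes=1 --mem=900G

# GPU: one A40 GPU alongside a few CPU cores
#SBATCH --partition=gpu --gres=gpu:1 --cpus-per-task=4
</pre>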
 

[[Main Page | Main Page]]  /   [[FurtherTopics/FurtherTopics #Further Reading| Further Topics]]