TensorflowforGPU

From HPC
Revision as of 08:29, 21 April 2023 by Pysdlb (talk | contribs) (Building a Virtual Environment)


Introduction

This page is for people intending to use the TensorFlow package on a GPU-based node. It also touches on the PyTorch package.

Building a Virtual Environment

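To build a virtual environment for the GPU nodes, create a conda environment and install the GPU build of each package into it. In the session below the requested spec is tensorflow-gpu, which conda resolves to CUDA-enabled builds from the conda-forge channel.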

[pysdlb@login01 ~]$ module load python/anaconda/20220712/3.9
[pysdlb@login01 ~]$ conda create -n tensorflow01
Collecting package metadata (current_repodata.json): done
Solving environment: done
:
:
# To activate this environment, use
#
#     $ conda activate tensorflow01

[pysdlb@login01 ~]$ conda activate tensorflow01
(tensorflow01) [pysdlb@login01 ~]$ conda install -c conda-forge tensorflow-gpu

## Package Plan ##

  environment location: /home/pysdlb/.conda/envs/tensorflow01

  added / updated specs:
    - tensorflow-gpu


The following packages will be downloaded:
:
:
  tensorboard        conda-forge/noarch::tensorboard-2.6.0-pyhd8ed1ab_1
  tensorboard-data-~ conda-forge/linux-64::tensorboard-data-server-0.6.1-py39hd97740a_4
  tensorboard-plugi~ conda-forge/noarch::tensorboard-plugin-wit-1.8.1-pyhd8ed1ab_0
  tensorflow         conda-forge/linux-64::tensorflow-2.6.2-cuda112py39h9333c2f_1
  tensorflow-base    conda-forge/linux-64::tensorflow-base-2.6.2-cuda112py39he9472f8_1
  tensorflow-estima~ conda-forge/linux-64::tensorflow-estimator-2.6.2-cuda112py39h9333c2f_1
  tensorflow-gpu     conda-forge/linux-64::tensorflow-gpu-2.6.2-cuda112py39h0bbbad9_1
  termcolor          conda-forge/noarch::termcolor-1.1.0-pyhd8ed1ab_3
  tk                 conda-forge/linux-64::tk-8.6.12-h27826a3_0
  typing-extensions  conda-forge/noarch::typing-extensions-3.7.4.3-0
  typing_extensions  conda-forge/noarch::typing_extensions-3.7.4.3-py_0
  tzdata             conda-forge/noarch::tzdata-2023c-h71feb2d_0
  urllib3            conda-forge/noarch::urllib3-1.26.15-pyhd8ed1ab_0
  werkzeug           conda-forge/noarch::werkzeug-2.2.3-pyhd8ed1ab_0
  wheel              conda-forge/noarch::wheel-0.40.0-pyhd8ed1ab_0
  wrapt              conda-forge/linux-64::wrapt-1.12.1-py39h3811e60_3
  xz                 conda-forge/linux-64::xz-5.2.6-h166bdaf_0
  yarl               conda-forge/linux-64::yarl-1.8.2-py39hb9d737c_0
  zipp               conda-forge/noarch::zipp-3.15.0-pyhd8ed1ab_0
  zlib               conda-forge/linux-64::zlib-1.2.13-h166bdaf_4
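
The build strings in the listing above appear to encode the CUDA version each package was built against (for example, cuda112 for CUDA 11.2). As a rough check that a GPU build was selected, the following sketch extracts that version from a package spec. The function name cuda_version_from_build is hypothetical, and the "cuda" plus digits naming convention is an assumption based on the conda-forge builds shown here, not a documented guarantee.

```python
import re
from typing import Optional

# Assumed convention: "cuda" followed by digits, where the final digit is the
# minor version (cuda112 -> 11.2, cuda102 -> 10.2).
_CUDA_TAG = re.compile(r"cuda(\d+)(\d)")


def cuda_version_from_build(package_spec: str) -> Optional[str]:
    """Return e.g. '11.2' for a spec containing 'cuda112', or None if untagged."""
    match = _CUDA_TAG.search(package_spec)
    if match is None:
        return None  # no CUDA tag: probably a CPU-only or noarch build
    major, minor = match.groups()
    return f"{major}.{minor}"


if __name__ == "__main__":
    spec = "conda-forge/linux-64::tensorflow-2.6.2-cuda112py39h9333c2f_1"
    print(cuda_version_from_build(spec))
```

A result of None for the tensorflow or tensorflow-base line would suggest that a CPU-only build was installed.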


Proceed ([y]/n)?
:
:
Preparing transaction: done
Verifying transaction: done
Executing transaction: \
:
:
(tensorflow01) [pysdlb@login01 ~]$ interactive -pgpu
Job ID 3678519 connecting to gpu03, please wait...
Last login: Thu Apr 13 08:11:40 2023 from login01.cluster

[pysdlb@gpu03 ~]$ module load cuda/10.1.168
[pysdlb@gpu03 ~]$ conda activate tensorflow01

(tensorflow01) [pysdlb@gpu03 2_BasicModels]$ python logistic_regression.py

2023-04-21 09:25:02.678493: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2023-04-21 09:25:02.846792: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105] Found device 0 with properties:
name: NVIDIA A40 major: 8 minor: 6 memoryClockRate(GHz): 1.74
pciBusID: 0000:02:00.0
totalMemory: 44.37GiB freeMemory: 521.56MiB
2023-04-21 09:25:02.846830: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: NVIDIA A40, pci bus id: 0000:02:00.0, compute capability: 8.6)

etc....
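
Before running a longer job, it can be worth confirming that TensorFlow sees the GPU at all. The sketch below assumes the TensorFlow 2.x environment built above and uses tf.config.list_physical_devices; the helper name visible_gpus is hypothetical. Run it on a GPU node (after interactive -pgpu), since a login node would normally report no GPU.

```python
def visible_gpus():
    """Return the names of GPUs TensorFlow can see.

    Returns an empty list if no GPU is visible or if TensorFlow is not
    installed in the active environment.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return []
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    gpus = visible_gpus()
    print("GPUs visible to TensorFlow:", gpus if gpus else "none")
```

An empty result on a GPU node usually points to a CPU-only build or a CUDA library mismatch rather than a problem with the script itself.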

Running on a GPU

Further Information