Programming/Python

Latest revision as of 14:18, 13 June 2024

Programming Details

Python is a widely used, high-level, general-purpose programming language.

An interpreted language, Python has a design philosophy that emphasizes code readability (notably using whitespace indentation rather than curly braces or keywords to delimit code blocks) and a syntax that lets programmers express concepts in fewer lines of code than is possible in languages such as C++ or Java.

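When programming with Python in an HPC environment, change the first line of your script (the shebang) from

 #!/usr/bin/python

to

 #!/usr/bin/env python

The second form looks up python on your PATH, so the script runs with the interpreter provided by the Python module you have loaded (see Modules Available) rather than the system Python.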
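
To see which interpreter the env form of the shebang would pick up, the same PATH lookup can be reproduced from Python itself. This is a minimal sketch using only the standard library; the helper name resolve_interpreter is hypothetical and not part of the original page.

```python
import shutil
import sys


def resolve_interpreter(name="python"):
    """Return the first executable called `name` found on PATH, or None.

    This mirrors the search that `#!/usr/bin/env python` performs.
    """
    return shutil.which(name)


if __name__ == "__main__":
    # After `module add python/...` both lines would normally point
    # into the module's installation rather than /usr/bin.
    print("env would run:     ", resolve_interpreter("python"))
    print("currently running: ", sys.executable)
```

If the first line printed is None, no executable named python is on your PATH, which usually means no Python module has been loaded yet.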

Python example


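#!/usr/bin/env python
# Broadcast a Python object from rank 0 to every other MPI rank.

from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

# Only the root rank builds the data; every other rank starts with None.
if rank == 0:
    data = {'key1': [7, 2.72, 2+3j],
            'key2': ('abc', 'xyz')}
else:
    data = None

# After bcast, every rank holds a copy of the root's data.
data = comm.bcast(data, root=0)

if rank != 0:
    print("data is %s and %d" % (data, rank))
else:
    print("I am master\n")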
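
Because of the Global Interpreter Lock (GIL) in the Python interpreter, threads cannot run Python code on several cores at once. For CPU-intensive work you therefore need either multiple processes (for example MPI through mpi4py, as above) or C extensions that release the GIL during computation, such as many NumPy functions, which are compiled binaries for speed.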
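
For work that stays on a single node, the standard library's multiprocessing module is another way to use several processes instead of threads. The sketch below is illustrative only: the function names are hypothetical, the prime-counting task is a stand-in for real CPU-bound work, and it assumes the "fork" start method, which is available on Linux but not on Windows.

```python
import multiprocessing as mp


def count_primes(limit):
    """CPU-bound toy task: count the primes below `limit` by trial division."""
    count = 0
    for candidate in range(2, limit):
        if all(candidate % d for d in range(2, int(candidate ** 0.5) + 1)):
            count += 1
    return count


def parallel_prime_counts(limits, workers=2):
    """Run count_primes for each limit in a pool of separate processes.

    Each worker is its own Python process with its own GIL, so the
    tasks can run on different cores at the same time.
    Assumption: the "fork" start method (Linux).
    """
    ctx = mp.get_context("fork")
    with ctx.Pool(processes=workers) as pool:
        return pool.map(count_primes, limits)


if __name__ == "__main__":
    print(parallel_prime_counts([10, 100, 1000]))  # prints [4, 25, 168]
```

Unlike mpi4py, this approach cannot span several nodes, so it suits jobs that request a single node.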

Modules Available

The following modules are available:

Python provided by the Python Software Foundation:

  • module add python/2.7.11
  • module add python/3.5.1

Anaconda Python (Anaconda is the open data science platform powered by Continuum):

  • module add python/anaconda/4.0/2.7
  • module add python/anaconda/4.0/3.5
  • module add python/anaconda/4.1.1/2.7

Miniconda is provided by:

  • module add python/anaconda/4.3.31/3.6-VE
  • module add python/anaconda/4.6/miniconda/3.7
  • module add python/anaconda/202111/3.9
  • module add python/anaconda/20220712/3.9
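
Because both Python 2.7 and Python 3.x modules are offered, a script can guard against being started under the wrong one. The following is a small sketch with hypothetical helper names; the minimum version passed in is only an example.

```python
import sys


def python_version(version_info=None):
    """Return the interpreter version as an 'X.Y' string."""
    info = sys.version_info if version_info is None else version_info
    return "%d.%d" % (info[0], info[1])


def require_python(minimum, version_info=None):
    """Raise RuntimeError if the interpreter is older than `minimum`.

    `minimum` is a (major, minor) tuple, for example (3, 5).
    """
    info = sys.version_info if version_info is None else version_info
    if tuple(info[:2]) < tuple(minimum):
        raise RuntimeError(
            "Python %d.%d or newer is required, found %s"
            % (minimum[0], minimum[1], python_version(info))
        )
    return python_version(info)


if __name__ == "__main__":
    # Fails early with a clear message if an older module is loaded.
    print("Running under Python", require_python((3, 5)))
```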

Compilation

Python needs no separate compilation step. The interpreter byte-compiles your script automatically when you run it, for example:


[username@login01 ~]$  python myPython.py
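
The byte-compilation step normally happens behind the scenes: the main script's bytecode is kept in memory, and cached .pyc files are written only for imported modules. To make the step visible, the standard library's py_compile module can perform it explicitly. This is a sketch for illustration; the function name and the temporary file are hypothetical.

```python
import os
import py_compile
import tempfile


def byte_compile_demo():
    """Write a tiny script to a temporary directory and byte-compile it."""
    with tempfile.TemporaryDirectory() as tmp:
        source = os.path.join(tmp, "myPython.py")
        with open(source, "w") as handle:
            handle.write("print('hello')\n")
        # py_compile.compile returns the path of the bytecode file it wrote.
        compiled = py_compile.compile(
            source, cfile=os.path.join(tmp, "myPython.pyc"), doraise=True
        )
        return os.path.basename(compiled), os.path.getsize(compiled) > 0


if __name__ == "__main__":
    print(byte_compile_demo())
```

In everyday use there is no need to call py_compile yourself; running the script as shown above is enough.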


Usage Examples

Batch example

 
#!/bin/bash
#SBATCH -J compute-single-node
#SBATCH -N 1
#SBATCH --ntasks-per-node 20
#SBATCH -o %N.%j.%a.out
#SBATCH -e %N.%j.%a.err
#SBATCH -p compute
#SBATCH --exclusive
#SBATCH --mail-user=<your email address here>

echo $SLURM_JOB_NODELIST

module purge
module add python/anaconda/20220712/3.9
module add openmpi/gcc/1.10.2

mpirun python broadcast.py


[username@login01 ~]$ sbatch python-demo.job
Submitted batch job 289572
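
Inside a running job, a Python script can read the same Slurm environment variables that the batch script echoes. The sketch below assumes the standard Slurm variables SLURM_JOB_ID, SLURM_JOB_NODELIST and SLURM_NTASKS; the function name is hypothetical, and the environment can be passed in as a dictionary so the code also runs outside a job.

```python
import os


def slurm_context(environ=None):
    """Collect a few Slurm-provided variables, using None for any that are unset."""
    env = os.environ if environ is None else environ
    names = ("SLURM_JOB_ID", "SLURM_JOB_NODELIST", "SLURM_NTASKS")
    return {name: env.get(name) for name in names}


if __name__ == "__main__":
    for name, value in slurm_context().items():
        print("%s=%s" % (name, value))
```

Run on a login node, outside of a job, every value is simply None.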

Python and OpenMP

The Python interpreter uses a GIL (Global Interpreter Lock), which prevents threads from executing Python code in parallel, so multi-threading gives little benefit for CPU-bound work. There are ways around this, and one of the most common is to use Cython.

Cython supports native parallelism through the cython.parallel module. To use this kind of parallelism, the GIL must be released (see Releasing the GIL). It currently supports OpenMP.

  • Example with a reduction (on sum):

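# Cython source: save as a .pyx file and compile it (it is not a runnable Python script)
from cython.parallel import prange

cdef int i
cdef int n = 30
cdef int sum = 0

# Cython treats the in-place += inside prange as an OpenMP reduction on sum
for i in prange(n, nogil=True):
    sum += i

print(sum)

  • Example with a typed memoryview (e.g. a NumPy array):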

# Cython source: save as a .pyx file and compile it
from cython.parallel import prange

def func(double[:] x, double alpha):
    cdef Py_ssize_t i

    # prange must run without the GIL, hence nogil=True
    for i in prange(x.shape[0], nogil=True):
        x[i] = alpha * x[i]

To actually use the OpenMP support, you need to tell the C or C++ compiler to enable OpenMP. For GCC this can be done as follows in a setup.py, which is then built with, for example, python setup.py build_ext --inplace:


#!/usr/bin/env python
# distutils was removed in Python 3.12; setuptools provides the same setup and Extension
from setuptools import setup, Extension
from Cython.Build import cythonize

ext_modules = [
    Extension(
        "hello",
        ["hello.pyx"],
        extra_compile_args=['-fopenmp'],
        extra_link_args=['-fopenmp'],
    )
]

setup(
    name='hello-parallel-world',
    ext_modules=cythonize(ext_modules),
)

Numba

Numba is an alternative to Cython that is much simpler to use, at the cost of slightly lower performance. It is particularly worth considering if your code is built on NumPy.

For example, a matrix multiplication:

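import numpy as np
from numba import njit, prange

@njit(parallel=True)
def matmult(A, B):
    assert A.shape[1] == B.shape[0]
    res = np.zeros((A.shape[0], B.shape[1]))
    # Rows of res are independent, so the outer loop can run in parallel
    for i in prange(A.shape[0]):
        for k in range(A.shape[1]):
            for j in range(B.shape[1]):
                res[i, j] += A[i, k] * B[k, j]
    return res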

Numba also has support for CUDA GPU programming: https://numba.readthedocs.io/en/stable/cuda/overview.html

Visit the Numba website for more details: https://numba.pydata.org/.

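Next Steps

  • https://en.wikipedia.org/wiki/Python_(programming_language)
  • https://www.python.org/
  • https://www.continuum.io/anaconda-overview
  • OpenMPI
  • OpenMP
  • https://numba.pydata.org/

If you are trying to speed up your Python code and need some help, you can contact our RSE team through the Support Portal (https://hull.service-now.com/).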


