Difference between revisions of "Applications/guppy"

From HPC
m (Compute capability)
 
(24 intermediate revisions by the same user not shown)

Latest revision as of 12:15, 15 December 2022

Application Details

  • Description: Local accelerated base calling for Nanopore data.
  • Versions: 4.2.2, 6.1.7 and 6.4.2
  • Modules (CPU): guppy/cpu/4.2.2, guppy/cpu/6.1.7 and guppy/cpu/6.4.2
  • Modules (GPU): guppy/gpu/4.2.2, guppy/gpu/6.1.7 and guppy/gpu/6.4.2
  • Licence: Free, open-source

Note: guppy version 3 has been removed from production.

Description

Guppy is the basecalling software from Oxford Nanopore Technologies. It translates the raw electrical signal recorded by a nanopore sequencer into nucleotide sequences, and can filter the resulting reads by quality score. It is installed here in two builds, one that runs on CPUs only and one that is accelerated on GPUs, each available as a separate module.


Usage Examples

Running on a GPU makes basecalling considerably faster on large data sets. For small runs the CPU version is recommended, and is probably faster because it avoids the overhead of using the GPU.
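
The choice between the two builds can be kept in one place in a job script. The following is a minimal sketch, not a site-provided tool: the function names guppy_module and guppy_device_args are hypothetical, the module names are taken from the Application Details list, and the default caller and thread counts are placeholders rather than tuned values. It assumes the guppy_basecaller options --num_callers and --cpu_threads_per_caller for CPU runs and -x cuda:0 for GPU runs.

```shell
#!/bin/bash
# Hypothetical helpers: choose the guppy module and device options for a run.
# Module names come from the Application Details list; the default caller
# and thread counts are placeholders, not tuned recommendations.

guppy_module() {
    # $1 = "cpu" or "gpu"; $2 = version (defaults to 6.4.2)
    local mode="$1" version="${2:-6.4.2}"
    case "$mode" in
        cpu|gpu) echo "guppy/${mode}/${version}" ;;
        *) echo "unknown mode: ${mode}" >&2; return 1 ;;
    esac
}

guppy_device_args() {
    # $1 = "cpu" or "gpu"; $2 = callers, $3 = threads per caller (CPU only)
    local mode="$1" callers="${2:-4}" threads="${3:-2}"
    case "$mode" in
        gpu) echo "-x cuda:0" ;;
        cpu) echo "--num_callers ${callers} --cpu_threads_per_caller ${threads}" ;;
        *) echo "unknown mode: ${mode}" >&2; return 1 ;;
    esac
}
```

In a batch script these might be used as module load "$(guppy_module cpu)" and then appending $(guppy_device_args cpu 4 2) to the guppy_basecaller command, with callers multiplied by threads kept within the cores requested from SLURM.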


Batch example

#!/bin/bash
#SBATCH -J guppy.job
#SBATCH --exclusive
#SBATCH -o gpu.%j.out
#SBATCH -e gpu.%j.err
#SBATCH -p gpu
#SBATCH --gres=gpu

module purge
module load cuda/10.1.168
module load guppy/gpu/6.4.2

guppy_basecaller --input_path /home/user/folderIN --save_path /home/user/folderOUT \
    --flowcell FLO-MIN106 --kit SQK-RPB004 \
    --min_qscore 7 --qscore_filtering -x cuda:0
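
Once the job has finished, it can be useful to check how many reads passed the quality filter. The sketch below assumes that, with quality-score filtering enabled as in the batch example, guppy writes uncompressed four-line FASTQ records into pass and fail subfolders of the save path. The function names are hypothetical, and compressed output (.fastq.gz) is not handled.

```shell
#!/bin/bash
# Hypothetical helpers: count basecalled reads in guppy's pass/ and fail/ folders.
# Assumes uncompressed FASTQ with exactly four lines per record.

count_fastq_reads() {
    # $1 = directory to search; prints the number of FASTQ records found
    local dir="$1"
    local lines=0
    if [ -d "$dir" ]; then
        lines=$(find "$dir" -type f -name '*.fastq' -exec cat {} + | wc -l)
    fi
    echo $(( lines / 4 ))
}

summarise_guppy_output() {
    # $1 = the directory given to --save_path
    local save_path="$1"
    echo "pass: $(count_fastq_reads "${save_path}/pass")"
    echo "fail: $(count_fastq_reads "${save_path}/fail")"
}
```

For the batch example above this would be called as summarise_guppy_output /home/user/folderOUT.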

Compute capability

For applications that need to know the CUDA compute capability of the GPU nodes:

  • NVIDIA A40: compute capability 8.6 (GPU01 to GPU04)
  • NVIDIA P100: compute capability 6.0 (GPU05)
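
When building CUDA code for a particular node, the values above translate directly into an nvcc architecture flag. The following sketch encodes the list as a lookup. The function names are hypothetical, and the node names are assumed to match the labels used above regardless of case. Note that sm_86 is only accepted by CUDA 11.1 or later.

```shell
#!/bin/bash
# Hypothetical helpers: look up the compute capability of a GPU node and
# turn it into an nvcc -arch flag. Values are taken from the list above.

compute_capability() {
    # $1 = node name, e.g. gpu03 or GPU05 (case-insensitive)
    local node
    node=$(echo "$1" | tr '[:lower:]' '[:upper:]')
    case "$node" in
        GPU01|GPU02|GPU03|GPU04) echo "8.6" ;;   # NVIDIA A40
        GPU05) echo "6.0" ;;                     # NVIDIA P100
        *) echo "unknown node: $1" >&2; return 1 ;;
    esac
}

nvcc_arch_flag() {
    # $1 = node name; prints e.g. -arch=sm_86
    local cc
    cc=$(compute_capability "$1") || return 1
    echo "-arch=sm_${cc/./}"
}
```

For example, nvcc_arch_flag GPU05 produces the flag for the P100 node, which can be passed straight to nvcc.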

Further Information

  • https://community.nanoporetech.com/ (login required)
