Difference between revisions of "Applications/guppy"

From HPC

Latest revision as of 12:15, 15 December 2022

Application Details

  • Description: Local accelerated base calling for Nanopore data.
  • Versions: 4.2.2, 6.1.7 and 6.4.2
  • Modules (CPU): guppy/cpu/4.2.2, guppy/cpu/6.1.7 and guppy/cpu/6.4.2
  • Modules (GPU): guppy/gpu/4.2.2, guppy/gpu/6.1.7 and guppy/gpu/6.4.2
  • Licence: Proprietary (Oxford Nanopore Technologies); free to download for Nanopore Community members, not open-source

Note: guppy version 3 is now removed from production.

Description

Guppy is the basecalling software from Oxford Nanopore Technologies. It translates the raw electrical signal recorded by Nanopore sequencers into nucleotide sequences, and it can run either on CPUs alone or with GPU acceleration.


Usage Examples

Running guppy on a GPU makes basecalling considerably faster on large data sets. For small runs the CPU version is recommended, and it is probably faster because it avoids the GPU start-up overhead.


Batch example

#!/bin/bash
#SBATCH -J guppy.job
#SBATCH --exclusive
#SBATCH -o gpu.%j.out
#SBATCH -e gpu.%j.err
#SBATCH -p gpu
#SBATCH --gres=gpu

# Start from a clean environment, then load CUDA and the GPU build of guppy
module purge
module load cuda/10.1.168
module load guppy/gpu/6.4.2

# Basecall the reads in folderIN on the first GPU (-x cuda:0)
guppy_basecaller --input_path /home/user/folderIN --save_path /home/user/folderOUT --flowcell FLO-MIN106 --kit SQK-RPB004 --min_qscore 7 --qscore_filtering -x cuda:0
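
The batch example above uses the GPU module. For the small runs where the CPU version is recommended, a CPU variant of the same script could be generated along the following lines. This is only a sketch under stated assumptions: the helper name make_guppy_job, the partition name "compute", the output file names and the values given to --num_callers and --cpu_threads_per_caller are illustrative and are not taken from this page, so adjust them to the cluster and node size.

```shell
#!/bin/bash
# Sketch: print a SLURM script for guppy using either the CPU or the GPU module.
# Assumptions (not from this page): the CPU partition is called "compute" and
# 4 callers x 7 threads per caller suits the node.

make_guppy_job() {
    local mode="$1" input="$2" output="$3" device_args

    # Reject unknown modes before printing anything.
    case "$mode" in
        cpu|gpu) ;;
        *) echo "usage: make_guppy_job cpu|gpu INPUT_DIR OUTPUT_DIR" >&2
           return 1 ;;
    esac

    # Directives shared by both variants.
    printf '%s\n' '#!/bin/bash' '#SBATCH -J guppy.job' '#SBATCH --exclusive' \
        '#SBATCH -o guppy.%j.out' '#SBATCH -e guppy.%j.err'

    if [ "$mode" = gpu ]; then
        printf '%s\n' '#SBATCH -p gpu' '#SBATCH --gres=gpu' '' \
            'module purge' 'module load cuda/10.1.168' 'module load guppy/gpu/6.4.2'
        device_args='-x cuda:0'
    else
        # "compute" is a hypothetical partition name.
        printf '%s\n' '#SBATCH -p compute' '' \
            'module purge' 'module load guppy/cpu/6.4.2'
        device_args='--num_callers 4 --cpu_threads_per_caller 7'
    fi

    # Same basecalling options as the batch example; only the device part differs.
    printf '\nguppy_basecaller --input_path %s --save_path %s --flowcell FLO-MIN106 --kit SQK-RPB004 --min_qscore 7 --qscore_filtering %s\n' \
        "$input" "$output" "$device_args"
}
```

For example, `make_guppy_job cpu /home/user/folderIN /home/user/folderOUT > guppy_cpu.job` writes a script that can then be submitted with `sbatch guppy_cpu.job`.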

Compute capability

CUDA compute capability of the GPU nodes, for applications that require this information:

  • NVIDIA A40 is 8.6 (GPU01 to GPU04)
  • NVIDIA P100 is 6.0 (GPU05)
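
When a build or job script needs these values, the list above can be encoded as a small lookup. The following is a minimal sketch: the function names are hypothetical, and it assumes the node names are written as on this page (GPU01 to GPU05, in either upper or lower case).

```shell
#!/bin/bash
# Map a GPU node name to the CUDA compute capability listed on this page.
compute_cap_for_node() {
    case "$1" in
        GPU0[1-4]|gpu0[1-4]) echo "8.6" ;;   # NVIDIA A40
        GPU05|gpu05)         echo "6.0" ;;   # NVIDIA P100
        *) echo "unknown node: $1" >&2
           return 1 ;;
    esac
}

# Turn a capability such as "8.6" into the "sm_86" form used by nvcc's -arch option.
sm_arch_for_node() {
    local cap
    cap="$(compute_cap_for_node "$1")" || return 1
    echo "sm_${cap/./}"
}
```

A possible use is `nvcc -arch="$(sm_arch_for_node GPU05)" kernel.cu`, where kernel.cu stands for your own source file.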

Further Information

  • https://community.nanoporetech.com/ (login required)
