Applications/guppy

From HPC
Revision as of 15:03, 3 April 2017 by Pysdlb


Application Details

  • Description: ABySS is a de novo sequence assembler intended for short paired-end reads and large genomes.
  • Version: 1.5.2 (gcc-4.9.3)
  • Modules: abyss/1.5.2/gcc-4.9.3
  • Licence: Free, open-source

Usage Examples

Assemble a small synthetic data set


[username@login] module load abyss/1.5.2/gcc-4.9.3
[username@login] wget http://www.bcgsc.ca/platform/bioinfo/software/abyss/releases/1.3.4/test-data.tar.gz
[username@login] tar xzvf test-data.tar.gz
[username@login] abyss-pe k=25 name=test \
    in='test-data/reads1.fastq test-data/reads2.fastq'

Calculate assembly contiguity statistics

[username@login] module load abyss/1.5.2/gcc-4.9.3
[username@login] abyss-fac test-unitigs.fa


Parallel processing

The np option of abyss-pe specifies the number of processes to use for the parallel MPI job. Without any MPI configuration, this will allow you to use multiple cores on a single machine. To use multiple machines for assembly, you must create a hostfile for mpirun, which is described in the mpirun man page.
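As a rough illustration of the hostfile mentioned above, the sketch below writes a two-node hostfile in Open MPI syntax and adds up the slots to get a matching np value. The node names node01 and node02, the file name hostfile.txt and the slot counts are all hypothetical, and other MPI implementations use a different hostfile format, so check the mpirun man page on your system.

```shell
#!/bin/bash
# Hypothetical hostfile in Open MPI syntax: one line per node,
# "slots" is the number of MPI processes allowed on that node.
HOSTFILE=hostfile.txt
cat > "$HOSTFILE" <<'EOF'
node01 slots=40
node02 slots=40
EOF

# Sum the slots so that np matches what the hostfile offers.
TOTAL=$(awk -F'slots=' '{ sum += $2 } END { print sum }' "$HOSTFILE")
echo "hostfile offers $TOTAL slots; a matching setting would be np=$TOTAL"
```

When jobs are submitted through the scheduler, the node list normally comes from the allocation, so a hand-written hostfile like this is mainly relevant outside SLURM.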

Do not run mpirun -np 8 abyss-pe. To run ABySS with 8 processes, use abyss-pe np=8. The abyss-pe driver script starts the MPI processes itself, like so: mpirun -np 8 ABYSS-P.
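One way to see what the driver script would launch, without starting an assembly, is a dry run. abyss-pe is implemented as a Makefile, so make options such as --dry-run should be accepted. The sketch below assumes the test data from the earlier example has been unpacked in the current directory; the NP and READS variable names are only for illustration.

```shell
#!/bin/bash
# Sketch: print the commands abyss-pe would run for np=8 instead of running them.
NP=8
READS='test-data/reads1.fastq test-data/reads2.fastq'

if command -v abyss-pe >/dev/null 2>&1; then
    # --dry-run is passed through to make and only prints the commands,
    # which should include a line starting "mpirun -np 8 ABYSS-P".
    abyss-pe --dry-run np="$NP" k=25 name=test in="$READS"
else
    echo "abyss-pe not found on PATH; load the abyss module first"
fi
```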

The paired-end assembly stage is multithreaded, but must run on a single machine. The number of threads to use may be specified with the parameter j. The default value for j is the value of np.
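Because j defaults to np, a large np can request more threads than one machine has cores. The sketch below caps j at the local core count; it is one possible approach, not a site recommendation. The variable names are hypothetical, and nproc reports the cores visible on the machine where the script runs.

```shell
#!/bin/bash
# Sketch: choose j (threads for the single-machine paired-end stage)
# as the smaller of np and the number of cores on this machine.
NP=16
CORES=$(nproc 2>/dev/null || echo 1)   # fall back to 1 if nproc is unavailable
J=$(( NP < CORES ? NP : CORES ))

# Print the resulting command rather than running it.
echo "abyss-pe np=$NP j=$J k=25 name=test in='test-data/reads1.fastq test-data/reads2.fastq'"
```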

Note: this example runs directly on a high-memory node; normally you would access compute nodes through the scheduler.


[username@c230 ~]$ module add abyss/1.5.2/gcc-4.9.3
[username@c230 ~]$ abyss-pe np=40 k=25 name=test \
    in='test-data/reads1.fastq test-data/reads2.fastq'

Through SLURM this would become the script:


#!/bin/bash
  
#SBATCH -J abyss
#SBATCH -p highmem
#SBATCH -N 2
#SBATCH --ntasks-per-node=40
#SBATCH -t 00:30:00

module purge
module load abyss/1.5.2/gcc-4.9.3

# Run your ABySS commands; np=80 matches the 2 x 40 tasks requested above

abyss-pe name=test k=48 n=8 np=80 in='test-1.fa test-3.fa'
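A minimal sketch of submitting the script, assuming it has been saved as abyss.job (a hypothetical file name). sbatch and squeue are standard SLURM commands; the guard only avoids an error on machines where SLURM is not installed.

```shell
#!/bin/bash
# Sketch: submit the batch script and list your queued jobs.
SCRIPT=abyss.job   # hypothetical file name for the script above

if command -v sbatch >/dev/null 2>&1; then
    sbatch "$SCRIPT"        # prints the job ID on success
    squeue -u "$USER"       # shows your pending and running jobs
else
    echo "sbatch not found; run this on a login node of the cluster"
fi
```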

Further Information