Applications/ABySS
Application Details
- Description: ABySS is a de novo sequence assembler intended for short paired-end reads and large genomes.
- Version: 1.5.2 (gcc-4.9.3)
- Modules: abyss/1.5.2/gcc-4.9.3
- Licence: Free, open-source
Usage Examples
Assemble a small synthetic data set
[username@login] module load abyss/1.5.2/gcc-4.9.3
[username@login] wget http://www.bcgsc.ca/platform/bioinfo/software/abyss/releases/1.3.4/test-data.tar.gz
[username@login] tar xzvf test-data.tar.gz
[username@login] abyss-pe k=25 name=test \
    in='test-data/reads1.fastq test-data/reads2.fastq'
Calculate assembly contiguity statistics
[username@login] module load abyss/1.5.2/gcc-4.9.3
[username@login] abyss-fac test-unitigs.fa
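One of the contiguity statistics reported for an assembly is N50: the contig length at which contigs of that length or longer contain at least half of the assembled bases. The sketch below recomputes N50 from a FASTA file with standard tools, purely to illustrate the definition. The function name n50 is hypothetical, and abyss-fac may apply its own filters (such as a minimum contig length), so its figures need not match this simplified version.

```shell
#!/bin/bash
# n50 FILE: print the N50 of the sequences in a FASTA file.
# Illustrative only; not a reimplementation of abyss-fac.
n50() {
    awk '
        /^>/ { if (len > 0) print len; len = 0; next }  # header: flush previous record length
        { len += length($0) }                            # sequence line: accumulate length
        END { if (len > 0) print len }                   # flush the last record
    ' "$1" |
    sort -rn |                                           # longest contigs first
    awk '
        { lens[NR] = $1; total += $1 }
        END {
            for (i = 1; i <= NR; i++) {
                running += lens[i]
                # first length at which the running sum reaches half of all bases
                if (running * 2 >= total) { print lens[i]; exit }
            }
        }
    '
}
```

For example, n50 test-unitigs.fa prints a single number for the unitigs produced by the assembly above.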
Parallel processing
The np option of abyss-pe specifies the number of processes to use for the parallel MPI job. Without any MPI configuration, this will allow you to use multiple cores on a single machine. To use multiple machines for assembly, you must create a hostfile for mpirun, which is described in the mpirun man page.
Do not run mpirun -np 8 abyss-pe. To run ABySS with 8 processes, use abyss-pe np=8. The abyss-pe driver script starts the MPI processes itself, like so: mpirun -np 8 ABYSS-P.
The paired-end assembly stage is multithreaded, but must run on a single machine. The number of threads to use may be specified with the parameter j. The default value for j is the value of np.
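As a minimal sketch of how np and j fit together, the snippet below builds an abyss-pe command line that uses 8 MPI processes for the parallel stage and 4 threads for the paired-end stage. The variable names NP, J and CMD are illustrative, and the input files are those of the synthetic data set above. The command is only printed here, not executed.

```shell
#!/bin/bash
NP=8   # number of MPI processes (abyss-pe np=...)
J=4    # threads for the paired-end stage (abyss-pe j=...); defaults to np if omitted

# Build the command line. abyss-pe launches mpirun itself, so no mpirun prefix is needed.
CMD="abyss-pe np=$NP j=$J k=25 name=test in='test-data/reads1.fastq test-data/reads2.fastq'"

# Print the command for inspection.
echo "$CMD"
```

Leaving out j=4 would make the paired-end stage use 8 threads, matching np.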
Note: this example is run directly on a high-memory node; normally you would access compute nodes through the scheduler.
[username@c230 ~]$ module add abyss/1.5.2/gcc-4.9.3
[username@c230 ~]$ abyss-pe np=40
Through SLURM this would become the script:
#!/bin/bash
#SBATCH -J abyss
#SBATCH -p highmem
#SBATCH -N 2
#SBATCH --ntasks-per-node=40
#SBATCH -t 00:30:00

module purge
module load abyss/1.5.2/gcc-4.9.3

#Run your ABySS commands
abyss-pe name=test k=48 n=8 in='test-1.fa test-3.fa'
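The abyss-pe line in the script above sets no np value, so the tasks requested from SLURM are not passed on to ABySS. One possible way to connect the two, sketched below, is to derive np from the SLURM_NTASKS environment variable. This assumes SLURM exports SLURM_NTASKS for the job and that the MPI installation on the cluster picks up the SLURM allocation when abyss-pe calls mpirun; neither is confirmed here for this system. The variable names NP and CMD are illustrative. The fragment would replace the final abyss-pe line of the batch script.

```shell
# SLURM sets SLURM_NTASKS inside a job; fall back to 1 when run elsewhere.
NP="${SLURM_NTASKS:-1}"

# Same assembly parameters as the batch script, plus an explicit np.
CMD="abyss-pe np=$NP name=test k=48 n=8 in='test-1.fa test-3.fa'"

# Record the exact command in the job log.
echo "Running: $CMD"

# On the cluster, execute it (left commented out in this sketch):
# eval "$CMD"
```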