Difference between revisions of "Applications/Trinityrnaseq"
From HPC
m (Pysdlb moved page Trinityrnaseq to Applications/Trinityrnaseq without leaving a redirect) |
Revision as of 09:40, 19 April 2017
Application Details
- Description: Trinity assembles transcript sequences from Illumina RNA-Seq data.
- Version: 2.2.0 (compiled with 2.2.0)
- Modules: trinityrnaseq/gcc/2.2.0
- Licence: Github, open-source
Usage
Trinity is a method for the efficient and robust de novo reconstruction of transcriptomes from RNA-seq data. It combines three independent software modules, Inchworm, Chrysalis, and Butterfly, which are applied sequentially to process large volumes of RNA-seq reads. Trinity partitions the sequence data into many individual de Bruijn graphs, each representing the transcriptional complexity at a given gene or locus, and then processes each graph independently to extract full-length splicing isoforms and to tease apart transcripts derived from paralogous genes. Briefly, the process works as follows:
- Inchworm assembles the RNA-seq data into the unique sequences of transcripts, often generating full-length transcripts for a dominant isoform, but then reports just the unique portions of alternatively spliced transcripts.
- Chrysalis clusters the Inchworm contigs into clusters and constructs complete de Bruijn graphs for each cluster. Each cluster represents the full transcriptional complexity for a given gene (or sets of genes that share sequences in common). Chrysalis then partitions the full read set among these disjoint graphs.
- Butterfly then processes the individual graphs in parallel, tracing the paths that reads and pairs of reads take within the graph, ultimately reporting full-length transcripts for alternatively spliced isoforms, and teasing apart transcripts that correspond to paralogous genes.
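The de Bruijn graphs mentioned above are built from overlapping fixed-length substrings (k-mers) of the reads. The following is a toy sketch of k-mer extraction only, not Trinity's implementation; the function name `kmers` and the example sequence are hypothetical.

```shell
#!/bin/bash
# Toy illustration: print every overlapping k-mer of a sequence, one per line.
# This is NOT how Trinity is implemented; it only shows what a k-mer is.
kmers() {
    local seq="$1" k="$2"
    local len=${#seq}
    local start=1
    # Slide a window of width k along the sequence (cut uses 1-based columns).
    while [ $((start + k - 1)) -le "$len" ]; do
        printf '%s\n' "$seq" | cut -c "${start}-$((start + k - 1))"
        start=$((start + 1))
    done
}

# Example call with a made-up sequence and k=3:
kmers ACGTAC 3
```

In a de Bruijn graph, consecutive k-mers such as ACG and CGT are linked because they overlap by k-1 characters.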
Assemble RNA-Seq data
[username@login] module add trinityrnaseq/gcc/2.2.0
[username@login] Trinity --seqType fq --left reads_1.fq --right reads_2.fq --CPU 6 --max_memory 20G
The assembled transcripts are written to 'trinity_out_dir/Trinity.fasta'.
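A quick sanity check on the result is to count the FASTA header lines in the output file. This is a minimal sketch assuming the output path quoted above; `count_transcripts` is a hypothetical helper name, not part of Trinity.

```shell
#!/bin/bash
# Count the records in a FASTA file by counting lines that start with '>'.
# Hypothetical helper, not part of the Trinity distribution.
count_transcripts() {
    local fasta="$1"
    grep -c '^>' "$fasta"
}

# Usage, assuming the output location mentioned above:
#   count_transcripts trinity_out_dir/Trinity.fasta
```

Each isoform is a separate FASTA record, so this counts transcripts rather than genes.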
A typical script would be:
#!/bin/bash

# Decompress each input only if the .gz file exists and no uncompressed copy is present.
if [ -e reads.right.fq.gz ] && [ ! -e reads.right.fq ]; then
    gunzip -c reads.right.fq.gz > reads.right.fq
fi

if [ -e reads.left.fq.gz ] && [ ! -e reads.left.fq ]; then
    gunzip -c reads.left.fq.gz > reads.left.fq
fi

if [ -e reads2.right.fq.gz ] && [ ! -e reads2.right.fq ]; then
    gunzip -c reads2.right.fq.gz > reads2.right.fq
fi

if [ -e reads2.left.fq.gz ] && [ ! -e reads2.left.fq ]; then
    gunzip -c reads2.left.fq.gz > reads2.left.fq
fi

#######################################################
##  Run Trinity to Generate Transcriptome Assemblies ##
#######################################################

Trinity --seqType fq --max_memory 2G \
    --left reads.left.fq.gz,reads2.left.fq.gz \
    --right reads.right.fq.gz,reads2.right.fq.gz \
    --SS_lib_type RF --CPU 4 --no_cleanup --normalize_reads

##### Done Running Trinity #####

# Stop here when the script is called without arguments.
if [ $# -eq 0 ]; then
    exit 0
fi
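The decompression steps in a script like the one above differ only in the file name, so they could be collapsed into one function and a loop. This is a sketch under that assumption; `decompress_if_needed` is a hypothetical helper name, and the file names are the ones used in the script.

```shell
#!/bin/bash
# Decompress NAME.gz to NAME, keeping the .gz, unless NAME already exists.
# Hypothetical helper; does nothing when the .gz file is absent.
decompress_if_needed() {
    local base="$1"
    if [ -e "${base}.gz" ] && [ ! -e "${base}" ]; then
        gunzip -c "${base}.gz" > "${base}"
    fi
}

# Apply the helper to each read file named in the script.
for f in reads.left.fq reads.right.fq reads2.left.fq reads2.right.fq; do
    decompress_if_needed "$f"
done
```

Adding another library then only means adding its file names to the loop.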