SOLiD RNA-Seq & splice-aware mapping

I lost quite a lot of time trying to align color-space RNA-Seq reads. The SHRiMP paper explains nicely why it’s important to align SOLiD reads in color-space instead of converting them directly into sequence-space. Below is the simplest solution I have found: tophat running on top of the bowtie mapper (bowtie2 doesn’t support color-space), with color-space reads in .csfasta format.
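
To see why naive conversion hurts, here is a small sketch of color-space decoding in plain shell. It assumes the standard SOLiD two-base encoding (0 keeps the previous base; 1 swaps A/C and G/T; 2 swaps A/G and C/T; 3 swaps A/T and C/G). The function name decode_colorspace is made up for this illustration and is not part of any tool. Because each base is derived from the previous one, a single miscalled color changes every base after it.

```shell
# Decode a color-space read (primer base followed by colors) into bases.
# Illustration only: decode_colorspace is a hypothetical helper.
decode_colorspace() (
  rest=$1
  prev=${rest%"${rest#?}"}   # first character = primer base
  rest=${rest#?}             # remaining characters = colors
  out=""
  while [ -n "$rest" ]; do
    c=${rest%"${rest#?}"}    # next color
    rest=${rest#?}
    case "$prev$c" in
      A0|C1|G2|T3) prev=A ;;
      A1|C0|G3|T2) prev=C ;;
      A2|C3|G0|T1) prev=G ;;
      A3|C2|G1|T0) prev=T ;;
      *) prev=N ;;           # missing call ('.') or unknown symbol
    esac
    out="$out$prev"
  done
  printf '%s\n' "$out"
)

decode_colorspace T0123   # prints TGAT
decode_colorspace T0223   # one color changed: prints TCTA
```

Changing only the second color turns TGAT into TCTA, so one sequencing error becomes three mismatches after conversion. An aligner working in color-space sees just the single miscalled color.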

# generate genome index in color-space
bowtie-build --color GENOME.fa GENOME
 
# get SOLiD reads from SRA if you don't have them already in .csfasta
abi-dump SRR062662
 
# tophat splice-aware mapping in color-space
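mkdir -p tophat
ref=REFERENCE_DIR/GENOME
for f in READS_DIR/*.csfasta; do
  # sample name = file name without directory and .csfasta extension
  s=$(basename "$f" .csfasta)
  # skip samples that already have an output directory
  if [ ! -d "tophat/$s" ]; then
    echo "$(date) $f $s"
    # quality values are expected in READS_DIR/<sample>_QV.qual
    tophat -p 4 --no-coverage-search --color -o "tophat/$s" --quals "$ref" "$f" "READS_DIR/${s}_QV.qual"
  fi
done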
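
Since the loop above passes a matching _QV.qual file for every .csfasta, it can save time to check that the pairs are complete before starting. This is a rough sketch, assuming the same naming scheme as in the loop (SAMPLE.csfasta next to SAMPLE_QV.qual). The function name missing_quals is my own and not part of tophat or the SRA toolkit.

```shell
# List samples whose .csfasta file has no matching _QV.qual file.
# Illustration only: missing_quals is a hypothetical helper.
missing_quals() (
  reads_dir=$1
  for f in "$reads_dir"/*.csfasta; do
    [ -e "$f" ] || continue          # no .csfasta files matched the glob
    s=$(basename "$f" .csfasta)
    [ -f "$reads_dir/${s}_QV.qual" ] || printf '%s\n' "$s"
  done
)
```

Running `missing_quals READS_DIR` prints one sample name per line for every read file without quality values, and prints nothing if all pairs are present.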
