2015-11-22

Bowtie index building problem for U's in sequences

I am having a weird problem with the Bowtie index builder (bowtie-build).

I have two entries in my FASTA file:
forrest@narnia:/bioinfo/out-house/miRBase$ cat 157.fa
>ath-miR157a-5p-U
UUGACAGAAGAUAGAGAGCAC
>ath-miR157a-5p_T
TTGACAGAAGATAGAGAGCAC

After building the index, I use bowtie-inspect to check it.
forrest@narnia:/bioinfo/out-house/miRBase$ bowtie-build 157.fa  157 -q
forrest@narnia:/bioinfo/out-house/miRBase$ bowtie-inspect -s 157
Colorspace 0
SA-Sample 1 in 32
FTab-Chars 10
Sequence-1 ath-miR157a-5p-U 18
Sequence-2 ath-miR157a-5p_T 21

Strangely, the reported length of ath-miR157a-5p-U is 18 instead of 21. Its three U's have been dropped from the index.
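To double-check the arithmetic, here is a minimal Python sketch that counts how long the sequence would be if every U were skipped. The helper name length_without_u is my own, not part of Bowtie, and the sketch only assumes that U is the character being left out.

```python
def length_without_u(seq):
    """Return (full length, length after skipping every U/u)."""
    seq = seq.strip()
    # Keep every base except U, mimicking the suspected behaviour.
    kept = [base for base in seq if base not in "Uu"]
    return len(seq), len(kept)


full, kept = length_without_u("UUGACAGAAGAUAGAGAGCAC")
print(full, kept)  # 21 18
```

The second number matches the 18 reported by bowtie-inspect for ath-miR157a-5p-U, which is consistent with all three U's being skipped in this entry.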

Even more strangely, not every U is dropped: the problem affects some sequences but not others.
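One possible workaround, which I have not confirmed as the proper fix, is to rewrite the sequences in the DNA alphabet before building the index. The sketch below assumes the index builder only handles A, C, G and T reliably; the function name rna_to_dna_fasta is hypothetical. It replaces U with T on sequence lines only and leaves header lines alone, so a name such as ath-miR157a-5p-U keeps its U.

```python
def rna_to_dna_fasta(lines):
    """Replace U/u with T/t on sequence lines; leave '>' headers untouched."""
    out = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # Header line: keep the name exactly as it is.
            out.append(line)
        else:
            out.append(line.replace("U", "T").replace("u", "t"))
    return out


records = [">ath-miR157a-5p-U", "UUGACAGAAGAUAGAGAGCAC"]
print("\n".join(rna_to_dna_fasta(records)))
```

Writing the converted lines to a new FASTA file and building the index from that file should give both entries the same length of 21, if the assumption above holds.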
