Confused with the blast result

Post by JanChang » Jun 14 2012 10:25 am

A newbie question about a blast result for a mitochondrial cytochrome c oxidase subunit I (COI) gene. I ran blastx against the non-redundant database. The alignment picture is shown below.
Image
Then I took a look at the first homologous gene shown below.
Image

1. Does the result mean that my COI sequence contains errors, so that it cannot be translated into a complete protein without stop codon(s)? (I did choose the invertebrate mitochondrial code table.) However, I sequenced several individuals of the same species and got consistent results.
2. What are the gray lines in the figure? Gaps or introns? Could they be the reason for the different frames (+1, +2 and +3) in the comparison result? In addition, what are the short vertical lines in the figure (i.e., the two at about 50 and 745 bp)?
3. I've always been curious about the lower-case letters in the alignment result. Do they represent known motifs, or something else?
Thank you very much!
JanChang

Re: Confused with the blast result

Post by Astarte » Jun 21 2012 3:41 am

1) blastx translates your nucleotide query in all six reading frames and compares each translation to the protein database. You did not get a 100% hit, which suggests that the COI protein of your species is not yet in the database, but you are getting high-scoring hits to cytochrome c oxidase subunit I of other species. There are indeed some gaps and frameshifts, but this is not that unusual: if you compare the two sequences directly, you will probably see insertions or deletions that shift the frame. Of course, if a frame no longer corresponds to a translated part of your sequence, stop codons may appear in it. If you are less interested in the phylogenetic relationships between your sequence and other sequences and more interested in finding the same functional protein, a protein blast (blastp) might be worthwhile, especially if you know your frame.
2) Yes, the gray lines are larger gaps; the short vertical lines represent frameshifts.
3) These may be motifs, but generally lower-case letters mark low-complexity sequence. Such regions occur in many sequences and can artificially inflate your score, so they are masked.
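To check point 1 yourself, here is a minimal sketch in plain Python of a six-frame translation, which is roughly what blastx does to the query before searching. The two amino-acid strings follow the NCBI genetic code layout (codons ordered T, C, A, G) for the standard code and for the invertebrate mitochondrial code (table 5). The function names and the short test sequence are made up for illustration; this is not how BLAST itself is implemented.

```python
BASES = "TCAG"
# Amino acids for the 64 codons in TTT, TTC, TTA, TTG, TCT, ... order.
STANDARD = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Table 5 differs from the standard code at TGA (W), ATA (M), AGA and AGG (S).
INVERT_MITO = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSSSSVVVVAAAADDEEGGGG"


def codon_table(amino_acids):
    """Map each of the 64 codons to its amino acid ('*' marks a stop)."""
    codons = [a + b + c for a in BASES for b in BASES for c in BASES]
    return dict(zip(codons, amino_acids))


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq, table):
    """Translate complete codons only; unknown codons (e.g. with N) become 'X'."""
    return "".join(table.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3))


def six_frames(seq, table):
    """Translate a DNA sequence in frames +1..+3 and -1..-3."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:], table)
        frames[f"-{offset + 1}"] = translate(rc[offset:], table)
    return frames


def stops_per_frame(seq, table):
    """Count stop codons in each of the six frames."""
    return {name: protein.count("*") for name, protein in six_frames(seq, table).items()}


if __name__ == "__main__":
    toy = "ATGTGAGGA"  # hypothetical 9-bp sequence, not real COI data
    print(six_frames(toy, codon_table(STANDARD))["+1"])
    print(six_frames(toy, codon_table(INVERT_MITO))["+1"])
```

If one frame of your real sequence is free of stop codons under table 5 along its whole length, the sequence itself is probably fine and the frame changes in the blastx picture come from indels relative to the database hit.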

Personally, I prefer to use blast or blastx to identify similar genes or proteins, and then align and compare them in a separate program such as ClustalW.
Astarte

Re: Confused with the blast result

Post by Bluewoodtree » Feb 08 2013 10:29 pm

I can also recommend PSI-BLAST. Basically, its first run is a simple blastp, but it then builds an alignment of the best hits, and the profile derived from that alignment is used as the query in the next run, and so forth. It is a great tool for discovering more distantly related proteins.

http://www.ncbi.nlm.nih.gov/books/NBK2590/
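
To give a feel for the "alignment of the best hits" step, here is a toy sketch of turning a gapless alignment into a position-specific scoring matrix and scoring a sequence against it. It is a simplification under loud assumptions: a uniform background, a flat pseudocount, and no sequence weighting, none of which is how PSI-BLAST actually does it. The function names and the four-residue hits are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(aligned_hits, pseudocount=1.0):
    """Build per-column log-odds scores from equal-length, gapless hits.

    Assumes a uniform background frequency (1/20) and a flat pseudocount;
    both are simplifications chosen for illustration.
    """
    length = len(aligned_hits[0])
    if any(len(hit) != length for hit in aligned_hits):
        raise ValueError("all aligned hits must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    total = len(aligned_hits) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for col in range(length):
        counts = Counter(hit[col] for hit in aligned_hits)
        pssm.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pssm


def score(pssm, seq):
    """Sum the column scores for a sequence of the same length as the PSSM."""
    if len(seq) != len(pssm):
        raise ValueError("sequence length must match the PSSM length")
    return sum(column[aa] for column, aa in zip(pssm, seq))


if __name__ == "__main__":
    hits = ["MKVL", "MKIL", "MRVL"]  # hypothetical aligned best hits
    pssm = build_pssm(hits)
    print(score(pssm, "MRIL"), score(pssm, "WWWW"))
```

A sequence such as MRIL, which matches no single hit exactly but agrees with the family column by column, still scores well. That is the intuition behind why the iterated search picks up more distant relatives than a single blastp run.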
Bluewoodtree

