annotate mRNA and gDNA seq, Identify 5' and 3' untranslated r

Post by gene26 » Jul 05 2012 4:32 am

Hi, I was given the tasks below, but I have never done this before. Can anyone help me work through them step by step?



You have been provided with the Mouse Villin protein, mRNA and genomic DNA sequences.

The target promoter sequence is the 3.5 kb upstream of the transcription start site

Annotate the mRNA and gDNA sequences to show the start and stop codons (put these in red BOLD text)

Identify the 5’ and 3’ untranslated regions (UTRs) – highlight these in pink

Identify the transcription start site

Highlight the 3.5kb region upstream of the transcription start site in yellow

Identify the binding sites of the Villin F1, R1, F2 and R2 primers (underline these)



Villin mRNA sequence:

>NM_009509 Mus musculus villin 1 (Vil1) mRNA
GCTTGCCACAACTTCCTAAGATCTCCCAGGTGGTGGCTGCCTCTTCCAGACAGGCTCGTCCACCATGACTAAACTGAATGCCCAAGTCAAAGGCTCTCTCAACATCACCACTCCCGGGATACAGATATGGAGGATCGAGGCTATGCAGATGGTACCTGTTCCTTCCAGCACCTTTGGAAGCTTCTTCGATGGTGACTGCTATGTAGTCCTGGCTATCCACAAGACCAGCAGCACTCTCTCCTATGATATCCACTACTGGATTGGCCAGGACTCGTCCCAGGATGAGCAGGGGGCAGCTGCCATCTACACCACACAGATGGATGACTACCTGAAGGGCCGGGCTGTCCAGCACCGCGAGGTTCAAGGCAACGAGAGCGAGACTTTCCGGAGCTACTTCAAGCAAGGCCTTGTGATCCGGAAAGGGGGAGTGGCTTCCGGCATGAAGCACGTAGAAACAAACTCCTGTGATGTCCAGCGACTGTTGCACGTCAAGGGCAAGAGGAATGTGCTGGCTGGAGAGGTGGAAATGTCCTGGAAGAGTTTCAACAGAGGGGATGTCTTCCTGCTGGACCTTGGGAAGCTTATTATCCAGTGGAATGGGCCAGAGAGTAACCGCATGGAGAGACTTCGGGGCATGGCCTTGGCCAAAGAGATCCGAGACCAGGAACGGGGTGGACGTACCTACGTAGGTGTGGTGGACGGGGAGAAGGAAGGGGACTCCCCACAGCTGATGGCAATTATGAACCACGTGCTGGGCCCACGCAAAGAACTGAAGGCTGCTATTTCTGACTCAGTGGTGGAGCCGGCCGCTAAGGCTGCACTCAAGCTGTACCATGTGTCTGACTCCGAAGGAAAACTGGTGGTTAGAGAAGTTGCTACTCGGCCACTCACACAAGACCTGCTCAAGCATGAGGACTGTTACATCCTGGACCAGGGAGGCCTGAAGATCTTTGTGTGGAAGGGAAAAAATGCCAACGCTCAGGAGAGGAGCGGAGCCATGAGCCAGGCCTTGAACTTCATCAAAGCCAAGCAGTACCCACCGAGCACGCAGGTTGAGGTGCAGAATGACGGGGCCGAGTCCCCCATCTTCCAACAACTCTTCCAGAAGTGGACAGTGCCCAACCGGACCTCAGGCCTCGGCAAAACCCACACTGTGGGCTCTGTGGCTAAGGTGGAACAGGTGAAGTTTGATGCTCTGACCATGCATGTACAACCTCAGGTGGCTGCCCAGCAGAAAATGGTGGATGACGGGAGTGGGGAAGTGCAGGTGTGGCGCATCGAGGACTTGGAGCTGGTGCCTGTGGAGTCCAAGTGGCTGGGCCATTTCTACGGTGGTGACTGCTACCTGCTGCTCTACACCTACCTCATAGGGGAAAAGCAACACTACTTGTTATATATCTGGCAGGGCAGCCAGGCCAGCCAGGATGAAATTGCAGCCTCGGCGTATCAAGCCGTCCTGTTGGACCAGAAGTACAATGACGAGCCAGTACAGATCCGGGTCACGATGGGCAAGGAGCCGCCTCACCTCATGTCTATCTTCAAGGGCCGCATGGTGGTTTATCAGGGAGGCACCTCCCGAAAGAACAACTTGGAGCCTGTGCCCTCTACGAGGCTATTTCAGGTCCGAGGGACCAATGCTGATAACACCAAGGCTTTTGAGGTGACAGCCCGGGCCACGTCCCTCAACTCCAATGATGTCTTCATACTCAAGACTCCGTCCTGCTGCTACCTGTGGTGTGGGAAGGGCTGCAGTGGGGATGAGAGGGAGATGGCCAAGATGGTTGCTGATACCATCTCTCGGACGGAGAAACAAGTGGTAGTAGAGGGGCAGGAGCCAGCCAACTTCTGGATGGCTCTGGGCGGGAAGGCGCCCTACGCCAACACCAAGAGGCTGCAGGAGGAAAACCAAGTCATCACTCCTCGGCTCTTCGAGTGCTCCAACCAGACCGGACGCTTTCTGGCCACAGAGATCTTTGACTTCAATCAGGATGACCTGGAGG
AGGAGGATGTGTTCCTATTGGATGTCTGGGACCAGGTCTTCTTCTGGATAGGGAAACATGCCAATGAGGAAGAGAAGAAGGCTGCAGCTACAACTGTACAAGAATACCTCAAGACCCACCCTGGAAACCGAGACCTTGAGACCCCTATCATCGTGGTGAAGCAGGGACACGAGCCCCCCACCTTCACAGGCTGGTTCCTGGCTTGGGATCCCTTCAAGTGGAGTAACACCAAATCCTATGATGACCTTAAGGCAGAGCTGGGAAACTCTGGGGACTGGAGCCAGATTGCTGACGAGGTTATGAGCCCGAAAGTGGACGTTTTCACTGCCAATACCAGTCTGAGTTCTGGGCCCCTGCCCACCTTCCCCCTGGAGCAGCTGGTAAACAAGTCTGTAGAGGATCTCCCTGAGGGTGTGGACCCCAGCAGGAAGGAGGAGCACCTGTCCACCGAAGACTTCACTAGGGCCTTGGGCATGACTCCAGCTGCCTTCTCTGCCCTGCCTCGATGGAAGCAACAAAACATCAAGAAAGAAAAAGGACTGTTTTGAGAATTGAAGCTCTCTGGCTGTCCAGCAGCCCCTACCCTGCCTTCAAGGGCTTTGTGCCGCCATTACTGGTTTTAGTCCTGTGGCAGATGAAAATGTCCAATTGTACCTGTGAGCCACAGTGTGACAATTCCTTTTGTTTATAATAGTAATTTGCCCATTCCTTCAGACGCATGCCACAGACCCATGGAAATCTTGTAGAGTTTTCTTTTCTTAGATGGACAGCTAAGTACTCCAGGAGACATTAGCGTCTGGGGGTTTCTCTGGCACCGTCACTCACTCAGGATCTTATCCTGATCTTACCCTCCTCACACTCAAAAGGGAGGGCTAAGGCCAAAGCTGGGCTTACAGCTCTAACCCAGAGCCTTTGCAAAGCTCCACAGACTCCTCAAATGACAACACCAGGAACATGGGTTTGCTACGTGAAGTCCAATCAGAAGCCAATAGGTGATTTTCTCTTAAAACTGGTTATCCAGTGTCCCCAGGAACATGTCCCTTTAAACAAATAAATCAAACTAATATGAGGTTAATAAAGGCTTTAATGTCTCTCACACATTAAAACAAAACAAAAC

Post by Paddywhacker » Jul 08 2012 2:16 am

How you go about it depends very much on which software you are using, but you can do all of it in a text editor.

The position of the primers is easiest, as you just search the RNA for the primer sequence. If you cannot find it then reverse-complement the primer sequence and search again. Searching the DNA sequence will be similar.
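If you would rather script the primer search than do it by eye, here is a minimal Python sketch of the same idea: look for the primer as written, and if that fails, look for its reverse complement. The template and primer sequences are made-up toy data, not the real Villin F1/R1/F2/R2 primers, and the function names are my own.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(complement)[::-1]


def find_primer(template: str, primer: str):
    """Locate a primer on a template.

    Returns (start, end, strand) using 1-based inclusive coordinates on the
    template as written, or None if neither orientation is found.
    Only exact matches are reported (no mismatches allowed).
    """
    template, primer = template.upper(), primer.upper()
    pos = template.find(primer)
    if pos != -1:
        return pos + 1, pos + len(primer), "+"
    # Not found as written: try the reverse complement (a reverse primer).
    pos = template.find(reverse_complement(primer))
    if pos != -1:
        return pos + 1, pos + len(primer), "-"
    return None


# Toy data for illustration only (hypothetical primers).
template = "GCTTGCCACAACTTCCTAAGATCTCCCAGG"
forward_primer = "CCACAACTTC"
reverse_primer = "CCTGGGAGAT"  # reverse complement of ATCTCCCAGG

print(find_primer(template, forward_primer))  # (6, 15, '+')
print(find_primer(template, reverse_primer))  # (21, 30, '-')
```

The same function works on the gDNA sequence. Note that it only reports the first exact hit, so a primer that spans an exon-exon junction will be found in the mRNA but not in the gDNA.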

Next, find the coding region within the RNA. Back-translate the first few amino acids of your protein into triplet codons. As the genetic code is redundant (an amino acid may be encoded by several different triplets), write down all of the alternative triplets for each amino acid underneath each other. The protein will probably start with methionine, whose only codon is ATG (AUG in RNA). Then search your RNA sequence for a stretch that fits those triplets. The position you find will be the start of coding, and everything before that point will be the 5' UTR. Do the same for the end of your protein. The 3' UTR is the part of the RNA sequence after that, beginning just past the stop codon.
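The "write all the alternative triplets underneath each other" step can be sketched in Python by turning a short peptide into a regular expression that accepts any codon for each amino acid. This assumes the standard genetic code. The toy mRNA and the function names are made up for illustration, and a very short peptide can match by chance out of frame, so use enough residues on real data.

```python
import re
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def peptide_to_regex(peptide: str) -> str:
    """Build a regex in which each residue matches any of its codons."""
    parts = []
    for aa in peptide.upper():
        codons = [c for c, a in CODON_TABLE.items() if a == aa]
        if not codons:
            raise ValueError(f"unknown amino acid: {aa!r}")
        parts.append("(?:" + "|".join(codons) + ")")
    return "".join(parts)


def find_peptide(mrna: str, peptide: str):
    """Return the 0-based (start, end) span of the first stretch of mRNA
    that could encode the peptide, or None if there is no such stretch."""
    sequence = mrna.upper().replace("U", "T")
    match = re.search(peptide_to_regex(peptide), sequence)
    return match.span() if match else None


# Toy mRNA: 11-base 5' UTR, ORF encoding MTKLNA, TGA stop, short 3' UTR.
mrna = "GCTCGTCCACC" + "ATGACTAAACTGAATGCC" + "TGA" + "GAATTG"

print(find_peptide(mrna, "MTKLNA"))  # (11, 29): coding starts at index 11
start, end = find_peptide(mrna, "LNA")  # last residues of the toy protein
print(mrna[end:end + 3])  # TGA, the stop codon right after the last residue
```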

Now you have your protein-generating code marked up on your RNA.
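Once the start of coding is known, the rest of the mark-up on the RNA is just arithmetic. Here is a small Python sketch that walks codon by codon from the ATG to the first in-frame stop codon and reports the four features as 1-based inclusive coordinates. The toy sequence and the function name `annotate_mrna` are my own, and it assumes the 5' UTR is at least one base long.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def annotate_mrna(mrna: str, cds_start: int) -> dict:
    """Annotate an mRNA given the 0-based index of the A in the start ATG.

    Returns 1-based inclusive (start, end) coordinates for the 5' UTR,
    start codon, stop codon and 3' UTR.
    """
    seq = mrna.upper().replace("U", "T")
    if seq[cds_start:cds_start + 3] != "ATG":
        raise ValueError("cds_start does not point at an ATG")
    # Step through the reading frame until the first stop codon.
    for i in range(cds_start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return {
                "five_prime_utr": (1, cds_start),
                "start_codon": (cds_start + 1, cds_start + 3),
                "stop_codon": (i + 1, i + 3),
                "three_prime_utr": (i + 4, len(seq)),
            }
    raise ValueError("no in-frame stop codon found")


# Toy mRNA: 11-base 5' UTR, six codons, TGA stop, 6-base 3' UTR.
mrna = "GCTCGTCCACC" + "ATGACTAAACTGAATGCC" + "TGA" + "GAATTG"

for feature, (start, end) in annotate_mrna(mrna, 11).items():
    print(feature, start, end)
# five_prime_utr 1 11
# start_codon 12 14
# stop_codon 30 32
# three_prime_utr 33 38
```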

The next thing is to find that code within the DNA. There may be introns in the DNA introducing gaps, but you can cope with that.

Oh, duh, you only need to annotate the start and stop codons, so ignore that intron stuff if you like, but don't forget that the stop codon is the triplet immediately after the last codon of the protein-coding sequence. And the transcription start site has to be the start of the 5' UTR, i.e. the first base of the mRNA.
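Since the transcription start site is the first base of the mRNA, finding it in the gDNA amounts to searching the gDNA for the first few bases of the mRNA and then counting back 3.5 kb. A rough Python sketch follows. It assumes the gene is on the strand you were given and that the first exon is at least as long as the probe. The sequences are toy data, and `locate_tss`, `upstream_region` and `probe_len` are names I made up.

```python
def locate_tss(gdna: str, mrna: str, probe_len: int = 20):
    """Return the 0-based index in gdna where the mRNA begins, or None.

    Only the first probe_len bases of the mRNA are searched for, so that an
    intron further downstream does not break the match.
    """
    probe = mrna[:probe_len].upper().replace("U", "T")
    pos = gdna.upper().find(probe)
    return pos if pos != -1 else None


def upstream_region(tss: int, length: int = 3500):
    """Return the 0-based half-open (start, end) slice covering up to
    `length` bases immediately upstream of the TSS."""
    return max(0, tss - length), tss


# Toy gDNA: 18 bases of upstream sequence, exon 1, an intron, exon 2.
upstream = "TTGACATTGACATTGACA"
exon1 = "GCTCGTCCACCATGACTAAA"
intron = "GTAAGTTTTTCAG"
exon2 = "CTGAATGCC"
gdna = upstream + exon1 + intron + exon2
mrna = exon1 + exon2  # the spliced transcript

tss = locate_tss(gdna, mrna, probe_len=12)
print(tss)  # 18
start, end = upstream_region(tss, length=10)  # use 3500 on the real data
print(gdna[start:end])  # CATTGACATTGACA is 14 bases; this prints the last 10
```

If the provided gDNA has fewer than 3.5 kb before the TSS, `upstream_region` simply returns what is available, which is worth checking before you highlight anything.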

You should have been told all of this in your introductory lectures. Were you asleep? :wink:

Anyway, once you can do it manually, using software such as Geneious is much more meaningful, because you understand what it is doing.

