C3 Exercises
In the previous chapter we had:#
c2e6: splicing structure (simplified for c3e1)#
A DNA sequence (from the + strand) has the following structure:
Exon1-Intron-Exon2
The first exon runs from the start of the DNA sequence up to position 63 (biological coordinates, not array coordinates).
Notes:
Be careful with your problem formulation. In this case, position 63 marks the end of the first exon, but position 63 itself lies within the intron, so the exon covers positions 1 to 62.
Do not mix up biological coordinates (counted from 1) and array coordinates (counted from 0).
The second exon starts at position 91 (biological coordinates) and runs to the end of the sequence. Assume that both exons are entirely protein-coding (CDS: coding DNA sequence).
a. Print the exon sequences and their lengths, one line per sequence.
Sample#
Input:
ATCGATCGATCGATCGACTGACTAGTCATAGCTATGCATGTAGCTACTCGATCGATCGATCGATCGATCGATCGATCGATCGATCATGCTATCATCGATCGATATCGATGCATCGACTACTAT
Output a:
ATCGATCGATCGATCGACTGACTAGTCATAGCTATGCATGTAGCTACTCGATCGATCGATCG 62
ATCATCGATCGATATCGATGCATCGACTACTAT 33
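The conversion between biological and array coordinates can be sketched as follows. This is a minimal sketch, assuming Python and its 0-based, end-exclusive slicing; the helper name `biological_slice` is hypothetical and not part of the exercise. The exon boundaries (1 to 62 and 91 to the end) follow the exercise text.

```python
def biological_slice(seq, start, end):
    """Return the part of seq between 1-based, inclusive positions start and end."""
    # 1-based inclusive start -> 0-based index: subtract 1.
    # 1-based inclusive end -> 0-based exclusive end: same number.
    return seq[start - 1:end]


dna = "ATCGATCGATCGATCGACTGACTAGTCATAGCTATGCATGTAGCTACTCGATCGATCGATCGATCGATCGATCGATCGATCGATCATGCTATCATCGATCGATATCGATGCATCGACTACTAT"

# Position 63 is already intron, so exon 1 ends at position 62.
exon1 = biological_slice(dna, 1, 62)
# Exon 2 runs from position 91 to the last position of the sequence.
exon2 = biological_slice(dna, 91, len(dna))

# One line per exon: the sequence followed by its length.
for exon in (exon1, exon2):
    print(exon, len(exon))
```

Keeping the subtraction in one helper means the rest of the program can use the positions exactly as they are written in the problem statement.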
c3e1: write splicing structure#
Using the results of problem c2e6, write each exon to a separate file, in lower case:
File names:
“c3e1_output_exon1.txt”
“c3e1_output_exon2.txt”
Print a message to the standard output (screen) confirming that each file has been written.
Sample#
Input:
ATCGATCGATCGATCGACTGACTAGTCATAGCTATGCATGTAGCTACTCGATCGATCGATCGATCGATCGATCGATCGATCGATCATGCTATCATCGATCGATATCGATGCATCGACTACTAT
Output:
...already written the file: c3e1_output_exon1.txt
...already written the file: c3e1_output_exon2.txt
"c3e1_output_exon1.txt" must contain:
atcgatcgatcgatcgactgactagtcatagctatgcatgtagctactcgatcgatcgatcg
"c3e1_output_exon2.txt" must contain:
atcatcgatcgatatcgatgcatcgactactat
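One possible way to write the two files is sketched below. It is a minimal sketch, assuming Python; the slice bounds repeat the c2e6 coordinates (positions 1 to 62 and 91 to the end), and ending each file with a newline is an assumption, since the exercise does not say whether one is required.

```python
dna = "ATCGATCGATCGATCGACTGACTAGTCATAGCTATGCATGTAGCTACTCGATCGATCGATCGATCGATCGATCGATCGATCGATCATGCTATCATCGATCGATATCGATGCATCGACTACTAT"

# Map each output file name to its exon (0-based, end-exclusive slices).
exons = {
    "c3e1_output_exon1.txt": dna[0:62],
    "c3e1_output_exon2.txt": dna[90:],
}

for file_name, exon in exons.items():
    # "w" creates the file, or overwrites it if it already exists.
    with open(file_name, "w") as handle:
        # Assumption: one trailing newline after the lower-case sequence.
        handle.write(exon.lower() + "\n")
    # Report on standard output only after the file has been closed.
    print("...already written the file:", file_name)
```

The `with` statement closes each file automatically, so the confirmation message is printed only once the content is safely on disk.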
c3e2: write some fasta files and a multifasta#
a. Write four FASTA files, each containing a different single-nucleotide repeat sequence of length 10:
File names:
“c3e2_output_polyA.fa”
“c3e2_output_polyT.fa”
“c3e2_output_polyG.fa”
“c3e2_output_polyC.fa”
Print a message to the standard output (screen) confirming that each file has been written.
b. Write the same four sequences to a single multi-FASTA file.
File name:
“c3e2_output_multifasta.fasta”
Print a message to the standard output (screen) confirming that the file has been written.
Sample#
Output a:
"c3e2_output_polyA.fa" must contain:
>poly A
AAAAAAAAAA
"c3e2_output_polyT.fa" must contain:
>poly T
TTTTTTTTTT
"c3e2_output_polyG.fa" must contain:
>poly G
GGGGGGGGGG
"c3e2_output_polyC.fa" must contain:
>poly C
CCCCCCCCCC
...already written the file: c3e2_output_polyA.fa
...already written the file: c3e2_output_polyT.fa
...already written the file: c3e2_output_polyG.fa
...already written the file: c3e2_output_polyC.fa
Output b:
"c3e2_output_multifasta.fasta" must contain:
>poly A
AAAAAAAAAA
>poly T
TTTTTTTTTT
>poly G
GGGGGGGGGG
>poly C
CCCCCCCCCC
...already written the file: c3e2_output_multifasta.fasta
Note:#
“fa” and “fasta” are both valid extensions for FASTA files.
Tip:#
Pay attention to line endings: the header line and the sequence line must each end with “\n”.
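The role of “\n” can be illustrated with a short sketch. It is a minimal sketch, assuming Python; the helper name `fasta_record` is hypothetical, and the headers and file names are the ones given in the exercise.

```python
def fasta_record(header, sequence):
    """Build one FASTA record: a '>' header line followed by a sequence line."""
    # Both lines end with "\n"; without it, the next record's ">" would be
    # glued to the end of this sequence in a multi-FASTA file.
    return ">" + header + "\n" + sequence + "\n"


# (header, sequence) pairs in the order shown in the sample output.
records = [("poly " + base, base * 10) for base in "ATGC"]

# Part a: one FASTA file per sequence.
for header, sequence in records:
    file_name = "c3e2_output_poly" + sequence[0] + ".fa"
    with open(file_name, "w") as handle:
        handle.write(fasta_record(header, sequence))
    print("...already written the file:", file_name)

# Part b: all four records, one after another, in a single file.
multifasta_name = "c3e2_output_multifasta.fasta"
with open(multifasta_name, "w") as handle:
    for header, sequence in records:
        handle.write(fasta_record(header, sequence))
print("...already written the file:", multifasta_name)
```

Because every record already ends with a newline, the multi-FASTA file is simply the concatenation of the four single records.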