7 Control Flow Statements¶

7.1 Exercises¶

7.1.1 Exercise¶

The Fibonacci sequence are the numbers in the following integer sequence:

0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, …

By definition, the first two numbers in the Fibonacci sequence are 0 and 1, and each subsequent number is the sum of the previous two. The fibonacci suite can be defined as following:

F0 = 0, F1 = 1.

Fn = Fn-1 + Fn-2

Write a function which take an integer n as parameter and returns a list containing the n first number of the Fibonacci sequence.

def fibonacci(n):
    fib_suite = []
    i = 0
    while i <= n:
        if i == 0:
            fib_suite.append(0)
        elif i == 1:
            fib_suite.append(1)
        else: 
            res = fib_suite[i-1] + fib_suite[i-2] 
            fib_suite.append(res)
        i += 1
    return fib_suite

print(', '.join([str(i) for i in fibonacci(10)]))

fibonacci_iteration.py . We will see another way more elegant to implement the fibonacci suite in Advanced Programming Techniques section.

7.1.2 Exercise¶

Reimplement your own function max (my_max). This function will take a list or tuple of float or integer and returns the largest element?

Write the pseudocode before to propose an implementation.

7.1.2.1 pseudocode¶

function my_max(l)

max <- first elt of l

for each elts of l

if elt is > max

max <- elt

return max

7.1.2.2 implementation¶

def my_max(seq):
   """
   return the maximum value in a sequence
   work only with integer or float
   """
   higest = seq[0]
   for i in seq:
      if i > highest:
          highest = i
   return highest

l = [1,2,3,4,58,9]
print(my_max(l))
58

7.1.3 Exercise¶

We want to establish a restriction map of a sequence.
But we will do this step by step.
and reuse the enzymes used in previous chapter:

create a function that take a sequence and an enzyme as parameter and return

the position of first binding sites. (write the pseudocode)

pseudocode

function one_enz_binding_site(dna, enzyme)

if enzyme binding site is substring of dna

return of first position of substring in dna

implementation

from operator import itemgetter


def one_enz_binding_site(dna, enzyme):
    """
    return the first position of enzyme binding site in dna
    or None if there is not
    """
    pos = dna.find(enzyme.sequence)
    if pos != -1:
        return pos

improve the previous function to return all positions of binding sites

pseudocode of first algorithm

function one_enz_binding_sites(dna, enzyme)

positions <- empty

if enzyme binding site is substring of dna

add the position of the first substring in dna in positions

positions <- find binding_sites in rest of dna sequence

return positions

implementation

def one_enz_binding_sites1(dna, enzyme):
    """
    return all positions of enzyme binding sites in dna
    """
    positions = []
    pos = dna.find(enzyme.sequence)
    if pos != -1:
        positions.append(pos)
        positions.extend(one_enz_binding_sites1(dna[pos+1:], enzyme))
    return positions
        
        

pseudocode of second algorithm

function one_enz_binding_sites(dna, enzyme)

positions <- empty

find first position of binding site in dna

while we find binding site in dna

add position of binding site to positions

find first position of binding site in dna in rest of dna

return positions

implementation

        
def one_enz_binding_sites(dna, enzyme):
    """
    return all positions of enzyme binding sites in dna
    """
    positions = []
    pos = dna.find(enzyme.sequence)
    while pos != -1:
        positions.append(pos)
        pos = dna.find(enzyme.sequence, pos + 1)
    return positions

search all positions of Ecor1 binding sites in dna_1

import collections
RestrictEnzyme = collections.namedtuple("RestrictEnzyme", "name comment sequence cut end")

ecor1 = RestrictEnzyme("EcoRI", "Ecoli restriction enzime I", "gaattc", 1, "sticky")

dna_1 = """tcgcgcaacgtcgcctacatctcaagattcagcgccgagatccccgggggttgagcgatccccgtcagttggcgtgaattcag
cagcagcgcaccccgggcgtagaattccagttgcagataatagctgatttagttaacttggatcacagaagcttccaga
ccaccgtatggatcccaacgcactgttacggatccaattcgtacgtttggggtgatttgattcccgctgcctgccagg"""

generalize the binding sites function to take a list of enzymes and return a list of tuple (enzyme name, position)

pseudocode

function binding_sites(dna, set of enzymes)

positions <- empty

for each enzyme in enzymes

pos <- one_enz_binding_sites(dna, enzyme)
pos <- for each position create a tuple enzyme name, position
positions <- pos

return positions

implementation

in bonus we can try to sort the list in the order of the position of the binding sites like this: [(‘Sau3aI’, 38), (‘SmaI’, 42), (‘Sau3aI’, 56), (‘EcoRI’, 75), …

def binding_sites(dna, enzymes):
    """
    return all positions of all enzymes binding sites present in dna
    sort by the incresing position
    """
    positions = []
    for enzyme in enzymes:
        pos = one_enz_binding_sites(dna, enzyme)
        pos = [(enzyme.name, pos) for pos in pos]
        positions.extend(pos)
    positions.sort(key = itemgetter(1))
    return positions 

import collections
RestrictEnzyme = collections.namedtuple("RestrictEnzyme", ("name", "comment", "sequence", "cut", "end")

ecor1 = RestrictEnzyme("EcoRI", "Ecoli restriction enzime I", "gaattc", 1, "sticky")
ecor5 = RestrictEnzyme("EcoRV", "Ecoli restriction enzime V", "gatatc", 3, "blunt")
bamh1 = RestrictEnzyme("BamHI", "type II restriction endonuclease from Bacillus amyloliquefaciens ", "ggatcc", 1, "sticky")
hind3 = RestrictEnzyme("HindIII", "type II site-specific nuclease from Haemophilus influenzae", "aagctt", 1 , "sticky")
taq1 = RestrictEnzyme("TaqI", "Thermus aquaticus", "tcga", 1 , "sticky")
not1 = RestrictEnzyme("NotI", "Nocardia otitidis", "gcggccgc", 2 , "sticky")
sau3a1 = RestrictEnzyme("Sau3aI", "Staphylococcus aureus", "gatc", 0 , "sticky")
hae3 = RestrictEnzyme("HaeIII", "Haemophilus aegyptius", "ggcc", 2 , "blunt")
sma1 =  RestrictEnzyme("SmaI", "Serratia marcescens", "cccggg", 3 , "blunt")

and the 2 dna fragments:

dna_1 = """tcgcgcaacgtcgcctacatctcaagattcagcgccgagatccccgggggttgagcgatccccgtcagttggcgtgaattcag
cagcagcgcaccccgggcgtagaattccagttgcagataatagctgatttagttaacttggatcacagaagcttccaga
ccaccgtatggatcccaacgcactgttacggatccaattcgtacgtttggggtgatttgattcccgctgcctgccagg"""

dna_2 = """gagcatgagcggaattctgcatagcgcaagaatgcggccgcttagagcgatgctgccctaaactctatgcagcgggcgtgagg
attcagtggcttcagaattcctcccgggagaagctgaatagtgaaacgattgaggtgttgtggtgaaccgagtaag
agcagcttaaatcggagagaattccatttactggccagggtaagagttttggtaaatatatagtgatatctggcttg"""

enzymes= (ecor1, ecor5, bamh1, hind3, taq1, not1, sau3a1, hae3, sma1)
binding_sites(dna_1, enzymes)
[('Sau3aI', 38), ('SmaI', 42), ('Sau3aI', 56), ('EcoRI', 75), ('SmaI', 95), ('EcoRI', 105),
('Sau3aI', 144), ('HindIII', 152), ('BamHI', 173), ('Sau3aI', 174), ('BamHI', 193), ('Sau3aI', 194)]

binding_sites(dna_2, enzymes)
[('EcoRI', 11), ('NotI', 33), ('HaeIII', 35), ('EcoRI', 98), ('SmaI', 106),
('EcoRI', 179), ('HaeIII', 193), ('EcoRV', 225)]

restriction.py .

7.1.4 Exercise¶

From a list return a new list without any duplicate, but keeping the order of items. For example:

>>> l = [5,2,3,2,2,3,5,1]
>>> uniqify_with_order(l)
>>> [5,2,3,1]

solution

>>> uniq = []
>>> for item in l:
>>>   if item not in uniq:
>>>      uniq.append(item)

solution

>>> uniq_items = set()
>>> l_uniq = [x for x in l if x not in uniq_items and not uniq_items.add(x)]

7 Control Flow Statements¶

7.1 Exercises¶

7.1.1 Exercise¶

7.1.2 Exercise¶

7.1.2.1 pseudocode¶

7.1.2.2 implementation¶

7.1.3 Exercise¶

7.1.4 Exercise¶

Table Of Contents

Previous topic

Next topic

This Page