7 Control Flow Statements¶
7.1 Exercises¶
7.1.1 Exercise¶
The Fibonacci sequence are the numbers in the following integer sequence:
0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, …
By definition, the first two numbers in the Fibonacci sequence are 0 and 1, and each subsequent number is the sum of the previous two. The Fibonacci suite can be defined as following:
Write a function that takes an integer n
as parameter
and returns a list containing the n
first numbers of the Fibonacci sequence.
1
2def fibonacci(n):
3 """
4 compute the suite of fibonacci for n element and return it
5 for instance ::
6
7 fibonacci(7)
8 [0, 1, 1, 2, 3, 5, 8, 13, 21]
9
10 :param int n: the nth elemnt of the fibonacci suite
11 :return: the nfirst elt of the fibonacci suite
12 :rtype: list of int
13 """
14 fib_suite = []
15 i = 0
16 while i <= n:
17 if i == 0:
18 fib_suite.append(0)
19 elif i == 1:
20 fib_suite.append(1)
21 else:
22 res = fib_suite[-1] + fib_suite[-2]
23 fib_suite.append(res)
24 i += 1
25 return fib_suite
26
27print(', '.join([str(i) for i in fibonacci(10)]))
28
29
30def fibonacci_2(n):
31 """
32 compute the fibonacci of the elt n
33 :param int n:
34 :return: the fibonacci of the elt n
35 :rtype: int
36 """
37 a = 0
38 b = 1
39 for i in range(n):
40 a, b = b, a + b
41 return b
42
fibonacci_iteration.py
.
We will see another way more elegant to implement the Fibonacci suite in Advanced Programming Techniques section.
7.1.2 Exercise¶
Reimplement your own function max
(call it my_max
).
This function will take a list or tuple of floats or integers and
return the largest element.
Write the pseudocode before proposing an implementation.
7.1.2.1 pseudocode¶
7.1.2.2 implementation¶
def my_max(seq):
"""
return the maximum value in a sequence
work only with integer or float
"""
highest = seq[0]
for i in seq:
if i > highest:
highest = i
return highest
l = [1, 2, 3, 4, 58, 9]
print(my_max(l))
58
7.1.3 Exercise¶
We want to create a “restriction map” of two sequences.
Create the following enzymes:
ecor1 = ("EcoRI", "Ecoli restriction enzime I", "gaattc", 1, "sticky")
ecor5 = ("EcoRV", "Ecoli restriction enzime V", "gatatc", 3, "blunt")
bamh1 = ("BamHI", "type II restriction endonuclease from Bacillus amyloliquefaciens", "ggatcc", 1, "sticky")
hind3 = ("HindIII", "type II site-specific nuclease from Haemophilus influenzae", "aagctt", 1, "sticky")
taq1 = ("TaqI", "Thermus aquaticus", "tcga", 1, "sticky")
not1 = ("NotI", "Nocardia otitidis", "gcggccgc", 2, "sticky")
sau3a1 = ("Sau3aI", "Staphylococcus aureus", "gatc", 0, "sticky")
hae3 = ("HaeIII", "Haemophilus aegyptius", "ggcc", 2, "blunt")
sma1 = ("SmaI", "Serratia marcescens", "cccggg", 3, "blunt")
Then create the following two DNA fragments:
dna_1 = """tcgcgcaacgtcgcctacatctcaagattcagcgccgagatccccgggggtt
gagcgatccccgtcagttggcgtgaattcagcagcagcgcaccccgggcgtagaattccagtt
gcagataatagctgatttagttaacttggatcacagaagcttccagaccaccgtatggatccc
aacgcactgttacggatccaattcgtacgtttggggtgatttgattcccgctgcctgccagg"""
dna_2 = """gagcatgagcggaattctgcatagcgcaagaatgcggccgcttagagcgatg
ctgccctaaactctatgcagcgggcgtgaggattcagtggcttcagaattcctcccgggagaa
gctgaatagtgaaacgattgaggtgttgtggtgaaccgagtaagagcagcttaaatcggagag
aattccatttactggccagggtaagagttttggtaaatatatagtgatatctggcttg"""
In a file <my_file.py>
Create a function one_line_dna that transforms a multi-line sequence into a single-line DNA sequence.
Create a collection containing all enzymes
Create a function that takes two parameters:
a sequence of DNA
a list of enzyme
and returns a collection containing the enzymes which cut the DNA.
Which enzymes cut:
dna_1
?dna_2
?dna_1
but notdna_2
?
7.1.4 Exercise¶
We want to establish a restriction map of a sequence. But we will do this step by step, and reuse the enzymes used in the previous exercise:
Create a function that takes a sequence and an enzyme as parameters, and returns the position of the first binding site. (Write the pseudocode.)
pseudocode
implementation
1from operator import itemgetter
2
3# we decide to implement enzyme as tuple with the following structure
4# (name, comment, sequence, cut, end)
5# 0 1 2 3 4
6
7
8def one_enz_one_binding_site(dna, enzyme):
9 """
10 :return: the first position of enzyme binding site in dna or None if there is not
11 :rtype: int or None
12 """
13 pos = dna.find(enzyme[2])
14 if pos != -1:
15 return pos
16
Improve the previous function to return all positions of binding sites.
pseudocode of first algorithm
implementation
1
2def one_enz_all_binding_sites(dna, enzyme):
3 """
4 :param dna: the dna sequence to search enzyme binding sites
5 :type dna: str
6 :param enzyme: the enzyme to looking for
7 :type enzyme: a tuple (str name, str comment, str sequence, int cut, str end)
8 :return: all positions of enzyme binding sites in dna
9 :rtype: list of int
10 """
11 positions = []
12 pos = dna.find(enzyme[2])
13 while pos != -1:
14 positions.append(pos)
15 pos = dna.find(enzyme[2], pos + 1)
16 return positions
17
pseudocode of second algorithm
implementation
1
2def one_enz_all_binding_sites2(dna, enzyme):
3 """
4
5 :param dna: the dna sequence to search enzyme binding sites
6 :type dna: str
7 :param enzyme: the enzyme to looking for
8 :type enzyme: a tuple (str name, str comment, str sequence, int cut, str end)
9 :return: all positions of enzyme binding sites in dna
10 :rtype: list of int
11 """
12 positions = []
13 pos = dna.find(enzyme[2])
14 while pos != -1:
15 if positions:
16 positions.append(pos)
17 else:
18 positions = pos + positions[-1]
19 new_seq = dna[pos + 1:]
20 pos = new_seq.find(enzyme[2])
21 pos = pos
22 return positions
23
Search all positions of Ecor1 binding sites in
dna_1
.
ecor1 = ("EcoRI", "Ecoli restriction enzime I", "gaattc", 1, "sticky")
dna_1 = """tcgcgcaacgtcgcctacatctcaagattcagcgccgagatccccgggggttgagcgatccccgtcagttggcgtgaattcag
cagcagcgcaccccgggcgtagaattccagttgcagataatagctgatttagttaacttggatcacagaagcttccaga
ccaccgtatggatcccaacgcactgttacggatccaattcgtacgtttggggtgatttgattcccgctgcctgccagg"""
Generalize the binding sites function to take a list of enzymes and return a list of tuples (enzyme name, position).
pseudocode
implementation
In bonus, we can try to sort the list in the order of the position of the binding sites like this:
[('Sau3aI', 38), ('SmaI', 42), ('Sau3aI', 56), ('EcoRI', 75), ...
1
2def binding_sites(dna, enzymes):
3 """
4 return all positions of all enzymes binding sites present in dna
5 sort by the increasing position.
6
7 :param dna: the dna sequence to search enzyme binding sites
8 :type dna: str
9 :param enzymes: the enzymes to looking for
10 :type enzymes: a sequence of enzyme tuple
11 :return: all positions of each enzyme binding sites in dna
12 :rtype: list of int
13 """
14 positions = []
15 for enzyme in enzymes:
16 pos = one_enz_all_binding_sites(dna, enzyme)
17 pos = [(enzyme[0], pos) for pos in pos]
18 positions.extend(pos)
19 positions.sort(key=itemgetter(1))
20 return positions
21
ecor1 = ("EcoRI", "Ecoli restriction enzime I", "gaattc", 1, "sticky")
ecor5 = ("EcoRV", "Ecoli restriction enzime V", "gatatc", 3, "blunt")
bamh1 = ("BamHI", "type II restriction endonuclease from Bacillus amyloliquefaciens ", "ggatcc", 1, "sticky")
hind3 = ("HindIII", "type II site-specific nuclease from Haemophilus influenzae", "aagctt", 1 , "sticky")
taq1 = ("TaqI", "Thermus aquaticus", "tcga", 1 , "sticky")
not1 = ("NotI", "Nocardia otitidis", "gcggccgc", 2 , "sticky")
sau3a1 = ("Sau3aI", "Staphylococcus aureus", "gatc", 0 , "sticky")
hae3 = ("HaeIII", "Haemophilus aegyptius", "ggcc", 2 , "blunt")
sma1 = ("SmaI", "Serratia marcescens", "cccggg", 3 , "blunt")
and the two dna fragments:
dna_1 = """tcgcgcaacgtcgcctacatctcaagattcagcgccgagatccccgggggttgagcgatccccgtcagttggcgtgaattcag
cagcagcgcaccccgggcgtagaattccagttgcagataatagctgatttagttaacttggatcacagaagcttccaga
ccaccgtatggatcccaacgcactgttacggatccaattcgtacgtttggggtgatttgattcccgctgcctgccagg"""
dna_2 = """gagcatgagcggaattctgcatagcgcaagaatgcggccgcttagagcgatgctgccctaaactctatgcagcgggcgtgagg
attcagtggcttcagaattcctcccgggagaagctgaatagtgaaacgattgaggtgttgtggtgaaccgagtaag
agcagcttaaatcggagagaattccatttactggccagggtaagagttttggtaaatatatagtgatatctggcttg"""
enzymes= (ecor1, ecor5, bamh1, hind3, taq1, not1, sau3a1, hae3, sma1)
binding_sites(dna_1, enzymes)
[('Sau3aI', 38), ('SmaI', 42), ('Sau3aI', 56), ('EcoRI', 75), ('SmaI', 95), ('EcoRI', 105),
('Sau3aI', 144), ('HindIII', 152), ('BamHI', 173), ('Sau3aI', 174), ('BamHI', 193), ('Sau3aI', 194)]
binding_sites(dna_2, enzymes)
[('EcoRI', 11), ('NotI', 33), ('HaeIII', 35), ('EcoRI', 98), ('SmaI', 106),
('EcoRI', 179), ('HaeIII', 193), ('EcoRV', 225)]
7.1.4.1 Bonus¶
If you prefer the enzyme implemented as namedtuple:
7.1.5 Exercise¶
Write a uniqify_with_order
function that takes a list and returns a new list without any duplicate, but keeping the order of items.
For instance:
>>> l = [5, 2, 3, 2, 2, 3, 5, 1]
>>> uniqify_with_order(l)
[5, 2, 3, 1]
Solution
>>> uniq = []
>>> for item in l:
>>> if item not in uniq:
>>> uniq.append(item)
Solution
>>> uniq_items = set()
>>> l_uniq = [x for x in l if x not in uniq_items and not uniq_items.add(x)]