.. sectnum::
   :start: 7

.. _Control_Flow_Statements:


***********************
Control Flow Statements
***********************

Exercises
=========

Exercise
--------

The Fibonacci sequence is the following sequence of integers:

    0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, ...

By definition, the first two numbers in the Fibonacci sequence are 0 and 1,
and each subsequent number is the sum of the previous two.
The Fibonacci sequence can be defined as follows:

|    F\ :sub:`0` = 0, F\ :sub:`1` = 1.
|    
|    F\ :sub:`n` = F\ :sub:`n-1` + F\ :sub:`n-2`

Write a function that takes an integer ``n`` as a parameter
and returns a list containing the first ``n`` numbers of the Fibonacci sequence.


.. literalinclude:: _static/code/fibonacci_iteration.py
   :linenos:
   :language: python
   
:download:`fibonacci_iteration.py <_static/code/fibonacci_iteration.py>` .
We will see a more elegant way to implement the Fibonacci sequence in the :ref:`Advanced Programming Techniques` section.
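
For reference, the recurrence can also be written out directly.
The following is a minimal iterative sketch, not necessarily identical to ``fibonacci_iteration.py``;
the function name ``fibonacci`` is an assumption made for this example.

.. code-block:: python

   def fibonacci(n):
       """Return a list containing the first n Fibonacci numbers."""
       fib = []
       # a and b hold two consecutive Fibonacci numbers, starting with F0 and F1
       a, b = 0, 1
       for _ in range(n):
           fib.append(a)
           # slide the window: the next number is the sum of the previous two
           a, b = b, a + b
       return fib

   print(fibonacci(10))  # prints [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]

Keeping only the last two values in ``a`` and ``b`` avoids indexing into the list while it is being built.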


Exercise
--------

Write your own version of the built-in function ``max`` (call it ``my_max``).
This function takes a list or tuple of floats or integers and
returns the largest element.

Write the pseudocode before proposing an implementation.

pseudocode
^^^^^^^^^^

| *function my_max(l)*
|   *max <- first elt of l*
|   *for each elt of l*
|       *if elt is > max*
|           *max <- elt*
|   *return max*


implementation
^^^^^^^^^^^^^^

::

   def my_max(seq):
       """
       Return the maximum value of a sequence.
       Works only with integers or floats.
       """
       # start from the first element, then keep the largest value seen so far
       highest = seq[0]
       for i in seq:
           if i > highest:
               highest = i
       return highest

   l = [1, 2, 3, 4, 58, 9]
   print(my_max(l))  # prints 58


Exercise
--------

We want to create a "restriction map" of two sequences.

Create the following enzymes::

   ecor1 = ("EcoRI", "Ecoli restriction enzyme I", "gaattc", 1, "sticky")
   ecor5 = ("EcoRV", "Ecoli restriction enzyme V", "gatatc", 3, "blunt")
   bamh1 = ("BamHI", "type II restriction endonuclease from Bacillus amyloliquefaciens", "ggatcc", 1, "sticky")
   hind3 = ("HindIII", "type II site-specific nuclease from Haemophilus influenzae", "aagctt", 1, "sticky")
   taq1 = ("TaqI", "Thermus aquaticus", "tcga", 1, "sticky")
   not1 = ("NotI", "Nocardia otitidis", "gcggccgc", 2, "sticky")
   sau3a1 = ("Sau3aI", "Staphylococcus aureus", "gatc", 0, "sticky")
   hae3 = ("HaeIII", "Haemophilus aegyptius", "ggcc", 2, "blunt")
   sma1 = ("SmaI", "Serratia marcescens", "cccggg", 3, "blunt")

Then create the following two DNA fragments::

   dna_1 = """tcgcgcaacgtcgcctacatctcaagattcagcgccgagatccccgggggtt
   gagcgatccccgtcagttggcgtgaattcagcagcagcgcaccccgggcgtagaattccagtt
   gcagataatagctgatttagttaacttggatcacagaagcttccagaccaccgtatggatccc
   aacgcactgttacggatccaattcgtacgtttggggtgatttgattcccgctgcctgccagg"""

   dna_2 = """gagcatgagcggaattctgcatagcgcaagaatgcggccgcttagagcgatg
   ctgccctaaactctatgcagcgggcgtgaggattcagtggcttcagaattcctcccgggagaa
   gctgaatagtgaaacgattgaggtgttgtggtgaaccgagtaagagcagcttaaatcggagag
   aattccatttactggccagggtaagagttttggtaaatatatagtgatatctggcttg"""

In a file ``<my_file.py>``:

#. Create a function *one_line_dna* that transforms a multi-line sequence into a single-line DNA sequence.
#. Create a collection containing all the enzymes.
#. Create a function that takes two parameters:

   #. a DNA sequence
   #. a list of enzymes

   and returns a collection containing the enzymes which cut the DNA.
   
Which enzymes cut:

* ``dna_1``?
* ``dna_2``?
* ``dna_1`` but not ``dna_2``?
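
The three steps above could be sketched as follows.
This is only one possible approach: the function name ``enz_filter`` and the short ``toy_dna`` fragment
are assumptions made for this example, and the enzyme list is reduced to three entries to keep it short.

.. code-block:: python

   def one_line_dna(seq):
       """Return seq with all newline characters removed."""
       return seq.replace("\n", "")


   def enz_filter(dna, enzymes):
       """Return the list of enzymes whose binding site occurs in dna."""
       cutting_enzymes = []
       for enzyme in enzymes:
           # the binding site is the third item of each enzyme tuple
           if enzyme[2] in dna:
               cutting_enzymes.append(enzyme)
       return cutting_enzymes


   ecor1 = ("EcoRI", "Ecoli restriction enzyme I", "gaattc", 1, "sticky")
   not1 = ("NotI", "Nocardia otitidis", "gcggccgc", 2, "sticky")
   sma1 = ("SmaI", "Serratia marcescens", "cccggg", 3, "blunt")
   enzymes = [ecor1, not1, sma1]

   # toy fragment, not taken from the exercise: the EcoRI site spans the line break
   toy_dna = """aaagaa
   ttctttcccgggaaa"""

   dna = one_line_dna(toy_dna)
   print([enz[0] for enz in enz_filter(dna, enzymes)])  # prints ['EcoRI', 'SmaI']

The EcoRI site is found only because the line break was removed first.
The enzymes that cut one fragment but not the other can then be obtained by comparing the two results,
for instance with a set difference on the enzyme names.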

.. _enzyme_exercise:

Exercise
--------

We want to establish the restriction map of a sequence.
We will do this step by step,
reusing the enzymes from the previous exercise:

* Create a function that takes a sequence and an enzyme as parameters, and returns the position of the first binding site. (Write the pseudocode.)

**pseudocode**

| *function one_enz_binding_site(dna, enzyme)*
|     *if enzyme binding site is substring of dna*
|          *return first position of substring in dna*

**implementation**

.. literalinclude:: _static/code/restriction.py
   :linenos:
   :lines: 1-16
   :language: python
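
Independently of the included file, the pseudocode for the first binding site could be sketched with ``str.find``.
Returning ``None`` when there is no binding site, and reporting 0-based positions, are assumptions of this sketch.

.. code-block:: python

   def one_enz_binding_site(dna, enzyme):
       """Return the 0-based position of the first binding site, or None."""
       # str.find returns -1 when the substring is absent
       pos = dna.find(enzyme[2])
       if pos != -1:
           return pos
       return None


   ecor1 = ("EcoRI", "Ecoli restriction enzyme I", "gaattc", 1, "sticky")
   print(one_enz_binding_site("aaagaattcgaattc", ecor1))  # prints 3
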

* Improve the previous function to return all positions of binding sites.

**pseudocode of first algorithm**

| *function one_enz_binding_sites(dna, enzyme)*
|     *positions <- empty*
|     *if enzyme binding site is substring of dna*
|          *add the position of the first substring in dna to positions*
|     *add to positions the binding sites found in the rest of the dna sequence*
|     *return positions*

**implementation**

.. literalinclude:: _static/code/restriction.py
   :linenos:
   :lines: 17-33
   :language: python
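
One way to read the first algorithm is as a recursion on the rest of the sequence.
The sketch below is an assumption about how this could look, not a copy of ``restriction.py``:
it only recurses when a site was found, and shifts the positions found in the rest back to the coordinates of the full sequence.

.. code-block:: python

   def one_enz_binding_sites(dna, enzyme):
       """Return the 0-based positions of all binding sites of enzyme in dna."""
       positions = []
       pos = dna.find(enzyme[2])
       if pos != -1:
           positions.append(pos)
           # search again in what follows the start of this match
           rest = dna[pos + 1:]
           for rest_pos in one_enz_binding_sites(rest, enzyme):
               # positions in rest are relative to rest, so shift them back
               positions.append(rest_pos + pos + 1)
       return positions


   ecor1 = ("EcoRI", "Ecoli restriction enzyme I", "gaattc", 1, "sticky")
   print(one_enz_binding_sites("aaagaattcgaattc", ecor1))  # prints [3, 9]
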

**pseudocode of second algorithm**

| *function one_enz_all_binding_sites_2(dna, enzyme)*
|     *positions <- empty*
|     *find first position of binding site in dna*
|     *while we find binding site in dna*
|         *add position of binding site to positions*
|         *find first position of binding site in rest of dna*
|     *return positions*

**implementation**

.. literalinclude:: _static/code/restriction.py
   :linenos:
   :lines: 34-56
   :language: python
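
The second algorithm maps naturally onto a ``while`` loop around ``str.find``, using its optional start argument to search the rest of the sequence.
This is a minimal sketch under that assumption, and it may differ from the code in ``restriction.py``.

.. code-block:: python

   def one_enz_all_binding_sites_2(dna, enzyme):
       """Return the 0-based positions of all binding sites of enzyme in dna."""
       site = enzyme[2]
       positions = []
       pos = dna.find(site)
       # str.find returns -1 once no further binding site exists
       while pos != -1:
           positions.append(pos)
           # restart the search just after the start of the last match
           pos = dna.find(site, pos + 1)
       return positions


   ecor1 = ("EcoRI", "Ecoli restriction enzyme I", "gaattc", 1, "sticky")
   print(one_enz_all_binding_sites_2("aaagaattcgaattc", ecor1))  # prints [3, 9]

Because the search restarts one character after the previous match, positions stay expressed in the coordinates of the full sequence and no offset bookkeeping is needed.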


* Search for all positions of EcoRI binding sites in ``dna_1``.

::

   ecor1 = ("EcoRI", "Ecoli restriction enzyme I", "gaattc", 1, "sticky")

   dna_1 = """tcgcgcaacgtcgcctacatctcaagattcagcgccgagatccccgggggttgagcgatccccgtcagttggcgtgaattcag
   cagcagcgcaccccgggcgtagaattccagttgcagataatagctgatttagttaacttggatcacagaagcttccaga
   ccaccgtatggatcccaacgcactgttacggatccaattcgtacgtttggggtgatttgattcccgctgcctgccagg"""


* Generalize the binding sites function to take a list of enzymes and return a list of tuples (enzyme name, position).

**pseudocode**

| *function binding_sites(dna, set of enzymes)*
|     *positions <- empty*
|     *for each enzyme in enzymes*
|         *pos <- one_enz_binding_sites(dna, enzyme)*
|         *pos <- for each position create a tuple (enzyme name, position)*
|         *add pos to positions*
|     *return positions*

**implementation**

As a bonus, try to sort the list by the position of the binding sites, like this::

    [('Sau3aI', 38), ('SmaI', 42), ('Sau3aI', 56), ('EcoRI', 75), ...

.. literalinclude:: _static/code/restriction.py
   :linenos:
   :lines: 57-
   :language: python

::

   ecor1 = ("EcoRI", "Ecoli restriction enzyme I", "gaattc", 1, "sticky")
   ecor5 = ("EcoRV", "Ecoli restriction enzyme V", "gatatc", 3, "blunt")
   bamh1 = ("BamHI", "type II restriction endonuclease from Bacillus amyloliquefaciens", "ggatcc", 1, "sticky")
   hind3 = ("HindIII", "type II site-specific nuclease from Haemophilus influenzae", "aagctt", 1, "sticky")
   taq1 = ("TaqI", "Thermus aquaticus", "tcga", 1, "sticky")
   not1 = ("NotI", "Nocardia otitidis", "gcggccgc", 2, "sticky")
   sau3a1 = ("Sau3aI", "Staphylococcus aureus", "gatc", 0, "sticky")
   hae3 = ("HaeIII", "Haemophilus aegyptius", "ggcc", 2, "blunt")
   sma1 = ("SmaI", "Serratia marcescens", "cccggg", 3, "blunt")

and the two DNA fragments::

   dna_1 = """tcgcgcaacgtcgcctacatctcaagattcagcgccgagatccccgggggttgagcgatccccgtcagttggcgtgaattcag
   cagcagcgcaccccgggcgtagaattccagttgcagataatagctgatttagttaacttggatcacagaagcttccaga
   ccaccgtatggatcccaacgcactgttacggatccaattcgtacgtttggggtgatttgattcccgctgcctgccagg"""

   dna_2 = """gagcatgagcggaattctgcatagcgcaagaatgcggccgcttagagcgatgctgccctaaactctatgcagcgggcgtgagg
   attcagtggcttcagaattcctcccgggagaagctgaatagtgaaacgattgaggtgttgtggtgaaccgagtaag
   agcagcttaaatcggagagaattccatttactggccagggtaagagttttggtaaatatatagtgatatctggcttg"""

   enzymes = (ecor1, ecor5, bamh1, hind3, taq1, not1, sau3a1, hae3, sma1)
   binding_sites(dna_1, enzymes)
   [('Sau3aI', 38), ('SmaI', 42), ('Sau3aI', 56), ('EcoRI', 75), ('SmaI', 95), ('EcoRI', 105), 
   ('Sau3aI', 144), ('HindIII', 152), ('BamHI', 173), ('Sau3aI', 174), ('BamHI', 193), ('Sau3aI', 194)]

   binding_sites(dna_2, enzymes)
   [('EcoRI', 11), ('NotI', 33), ('HaeIII', 35), ('EcoRI', 98), ('SmaI', 106), 
   ('EcoRI', 179), ('HaeIII', 193), ('EcoRV', 225)]

:download:`restriction.py <_static/code/restriction.py>` .
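
The generalization and the sorting bonus could be sketched as follows.
This is a self-contained illustration rather than the content of ``restriction.py``:
the helper is a plain ``str.find`` loop, and the short ``toy_dna`` sequence is an assumption made to keep the example readable.

.. code-block:: python

   from operator import itemgetter


   def one_enz_binding_sites(dna, enzyme):
       """Return the 0-based positions of all binding sites of enzyme in dna."""
       positions = []
       pos = dna.find(enzyme[2])
       while pos != -1:
           positions.append(pos)
           pos = dna.find(enzyme[2], pos + 1)
       return positions


   def binding_sites(dna, enzymes):
       """Return a list of (enzyme name, position) tuples sorted by position."""
       positions = []
       for enzyme in enzymes:
           for pos in one_enz_binding_sites(dna, enzyme):
               positions.append((enzyme[0], pos))
       # sort on the second item of each tuple, i.e. the position
       positions.sort(key=itemgetter(1))
       return positions


   ecor1 = ("EcoRI", "Ecoli restriction enzyme I", "gaattc", 1, "sticky")
   sma1 = ("SmaI", "Serratia marcescens", "cccggg", 3, "blunt")

   # toy sequence, not taken from the exercise
   toy_dna = "aaagaattctttcccgggaaagaattc"
   print(binding_sites(toy_dna, (ecor1, sma1)))
   # prints [('EcoRI', 3), ('SmaI', 12), ('EcoRI', 21)]

Sorting with ``itemgetter(1)`` orders the tuples by position only; a ``lambda`` returning the second item would work equally well.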

Bonus
^^^^^

If you prefer the enzymes implemented as a ``namedtuple``:

:download:`restriction_namedtuple.py <_static/code/restriction_namedtuple.py>` .
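
As a rough idea of what this looks like, an enzyme can be declared with ``collections.namedtuple``.
The type name ``RestrictEnzyme`` and the field names below are assumptions made for this sketch;
the downloadable file may use different ones.

.. code-block:: python

   from collections import namedtuple

   # hypothetical type and field names, chosen for this example only
   RestrictEnzyme = namedtuple("RestrictEnzyme", ["name", "comment", "sequence", "cut", "end"])

   ecor1 = RestrictEnzyme("EcoRI", "Ecoli restriction enzyme I", "gaattc", 1, "sticky")

   # fields can be read by name instead of by index
   print(ecor1.name, ecor1.sequence)  # prints EcoRI gaattc
   # a namedtuple is still a tuple, so index access keeps working
   print(ecor1[2] == ecor1.sequence)  # prints True

Access by name makes code such as ``enzyme.sequence`` easier to read than ``enzyme[2]``, while functions written for plain tuples continue to work.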


Exercise
--------

Write a ``uniqify_with_order`` function that takes a list and returns a new list without any duplicates, while keeping the order of the items.
For instance::

   >>> l = [5, 2, 3, 2, 2, 3, 5, 1]
   >>> uniqify_with_order(l)
   [5, 2, 3, 1]

Solution ::

   >>> uniq = []
   >>> for item in l:
   ...     if item not in uniq:
   ...         uniq.append(item)
   ...
   >>> uniq
   [5, 2, 3, 1]

Alternative solution, using a set to track the items already seen::

   >>> uniq_items = set()
   >>> l_uniq = [x for x in l if x not in uniq_items and not uniq_items.add(x)]
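
Since the exercise asks for a function, the same idea can be wrapped as follows.
This is a minimal sketch that combines a result list with a set of items already seen; it assumes the items are hashable.

.. code-block:: python

   def uniqify_with_order(items):
       """Return a new list without duplicates, keeping the first occurrence order."""
       seen = set()
       uniq = []
       for item in items:
           # membership tests on a set do not require scanning the whole collection
           if item not in seen:
               seen.add(item)
               uniq.append(item)
       return uniq


   l = [5, 2, 3, 2, 2, 3, 5, 1]
   print(uniqify_with_order(l))  # prints [5, 2, 3, 1]

Testing membership in a list rescans it for every item, which becomes slow on long inputs, whereas a set lookup does not depend on the number of items already seen.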