7. Control Flow Statements

We mentioned earlier that the statements in the source code are executed in turn, starting with the first one and progressing line by line. In practice, the flow of execution can be diverted by a function or method call or by a control structure, such as a conditional branch or a loop statement. Control is also diverted when an exception is raised.

7.1. Control Structures

The main control structures are conditional branching, which lets us choose the code to execute based on some conditions, and loops, in which a series of statements can be executed repeatedly.

7.1.1. Conditional Branching

A Boolean expression is anything that can be evaluated to produce a Boolean value (True or False). In Python, such an expression evaluates to False if it is the predefined constant False, the special object None, an empty sequence or collection (e.g., an empty string, list, tuple, or dictionary), or a numeric data item of value 0. Anything else is considered to be True.
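As a quick check of these rules, the sketch below passes a few values to the built-in bool(). The names falsy_values and truthy_values are illustrative only.

```python
# Values that Python treats as False in a Boolean context.
falsy_values = [False, None, 0, 0.0, "", [], (), {}]
# A few values that are treated as True (note the non-empty string "0").
truthy_values = [True, 1, -1, "0", " ", [0], (None,), {"a": 1}]

for value in falsy_values:
    print(repr(value), "->", bool(value))   # every line ends with False

for value in truthy_values:
    print(repr(value), "->", bool(value))   # every line ends with True
```

Note that a container is True as soon as it holds at least one item, even if that item is itself False, as with [0] above.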

In Python-speak a block of code, that is, a sequence of one or more statements, is called a suite. Because some of Python’s syntax requires that a suite be present, Python provides the keyword pass which is a statement that does nothing and that can be used where a suite is required (or where we want to indicate that we have considered a particular case) but where no processing is necessary.

7.1.1.1. The if statement syntax

if boolean_expression1:
    block of statements executed
    only if boolean_expression1 is true
elif boolean_expression2:
    block of statements executed
    only if boolean_expression2 is true
    and boolean_expression1 is false
elif boolean_expressionN:
    block of statements executed
    only if all the previous boolean expressions are false
    and boolean_expressionN is true
else:
    block of statements executed
    only if all conditions are false

Note

“elif” stands for “else if”

There can be zero or more elif clauses, and the final else clause is optional. If we want to account for a particular case, but want to do nothing if it occurs, we can use pass as that branch’s suite.
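A minimal sketch of pass used as the suite of a branch. The function name classify_base and the choice of "-" as a gap character are assumptions made for this illustration.

```python
def classify_base(base):
    """Return 'purine' or 'pyrimidine' for a DNA base, or None for a gap."""
    kind = None
    if base in ("a", "g"):
        kind = "purine"
    elif base in ("c", "t"):
        kind = "pyrimidine"
    elif base == "-":
        # A gap is a case we have considered, but there is nothing to do.
        pass
    else:
        raise ValueError("unexpected character: {0!r}".format(base))
    return kind

print(classify_base("a"))
print(classify_base("-"))
```

Without the pass branch, a gap would fall through to the else clause and raise an error, so the empty branch documents a deliberate decision.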

The if, elif, and else clauses together form a single statement. Execution enters the first block whose condition is True and then leaves the statement, without testing the remaining conditions. The flow is very different if we use a series of if statements without elif: in that case every if statement is evaluated independently. See the example script if.py:

i = 10

print("if elif statement")
if i > 1:
   print("i > 1")
elif i > 2:
   print("i > 2")
elif i > 3:
   print("i > 3")

print("suite of if statements")
if i > 1:
   print("i > 1")
if i > 2:
   print("i > 2")
if i > 3:
   print("i > 3")

The output of the if.py script execution:

if elif statement
i > 1
suite of if statements
i > 1
i > 2
i > 3

In some very simple cases we can reduce the if ... else statement to a single conditional expression. The syntax for such cases is:

expression1 if boolean_expression else expression2

If the Boolean expression evaluates to True, the result of the conditional expression is expression1; otherwise, the result is expression2. This syntax is often used to choose between a default value and an alternative. For instance:

if nucleic_type == 'RNA':
   bases = 'acgu'
else:
   bases = 'acgt'

This can be reduced to:

bases = 'acgu' if nucleic_type == 'RNA' else 'acgt'

7.1.1.1.1. Nested conditions

However, constructions with multiple alternatives are sometimes not sufficient, and you need to nest conditions like this:

>>> primer = "acgtagtcttgactgagct"
>>> primer_len = len(primer)
>>> # GC content expressed as a percentage
>>> primer_gc = 100 * (primer.count("g") + primer.count("c")) / primer_len
>>> if primer_gc > 50:
...     if primer_len > 20:
...         pcr_program = 1
...     else:
...         pcr_program = 2
... else:
...     pcr_program = 3
...
>>> pcr_program
3

7.1.2. Looping

7.1.2.1. for ... in loop

The for ... in statement is used to iterate over a collection.
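For illustration, here is a short sketch of for ... in applied to a string and to a list. The names seq, base_counts and codons are examples chosen here, not part of the course material.

```python
seq = "atggcgtaa"

# Iterating over a string yields its characters one by one.
base_counts = {}
for base in seq:
    base_counts[base] = base_counts.get(base, 0) + 1
print(base_counts)

# Iterating over a list yields its items in order;
# enumerate() additionally provides the index of each item.
codons = ["atg", "gcg", "taa"]
for index, codon in enumerate(codons):
    print(index, codon)
```

The loop variable (base, codon) is rebound to the next item at each iteration, so no explicit index handling is needed.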

7.1.2.2. while loop

Sometimes we need to repeat the same block of code until a condition is met. To do this we use the while loop. Here is the complete general syntax:

while boolean_expression:
   while_suite
else:
   else_suite

The else clause is optional. As long as the boolean_expression is True, the suite of the while block is executed. If the boolean_expression is or becomes False, the loop terminates, and if the optional else clause is present, its suite is executed. Inside the suite of the while block, if a continue statement is executed, control is immediately returned to the top of the loop, and the boolean_expression is evaluated again.

If the loop does not terminate normally, the optional else clause’s suite is skipped: this is the case when the loop is left through a break statement, a return statement (if the loop is in a function), or a raised exception. The optional else clause is rather confusingly named and not used very often.
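The behaviour of the else clause can be sketched with a small search loop. The helper first_stop_codon is hypothetical and only serves to show when the else suite runs.

```python
def first_stop_codon(seq):
    """Return the index of the first in-frame stop codon, or -1 if none."""
    stop_codons = ("taa", "tag", "tga")
    position = -1
    i = 0
    while i + 3 <= len(seq):
        if seq[i:i + 3] in stop_codons:
            position = i
            break            # leaves the loop and skips the else suite
        i += 3
    else:
        # Runs only when the condition became False without a break.
        print("no stop codon found")
    return position

print(first_stop_codon("atggcgtaa"))
print(first_stop_codon("atggcg"))
```

In the first call the break is reached, so nothing is printed by the else suite; in the second call the loop runs to completion and the message appears.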

A while loop in action:

# print a sequence, 50 characters per line
# (seq is assumed to be a string defined earlier)
i = 0
while i < len(seq):
   print(seq[i:i+50])
   i += 50

Beware of infinite loops. If the Boolean expression is always True, the program will loop endlessly:

while True:
   do_something()
   # this is an infinite loop
   # unless something in the loop breaks out of it:
   # - a break statement
   # - a return statement
   # - a raised exception

In some languages there is a do ... while statement:

do
    block of code
while boolean_expression

This is used to execute a block of code at least once, and then to repeat it as long as the Boolean expression is true.

In Python there is no do ... while statement, but we can easily emulate it with a while statement that breaks out of the loop as soon as the Boolean expression becomes false:

while True:
    do_something_at_least_once
    if not boolean_expression:
        break

For instance:

>>> i = 10
>>> while True:
...     print(i)
...     if i > 5:
...         break
...
10

When to use a while loop?

  • When you need to loop but not over a collection.
  • When there is a loop exit condition.
  • When you want to start a loop only upon a given condition.
  • When it may happen that nothing is done at all.
  • When you are searching for a particular element in a sequence data type.

7.2. Exception Handling

Until now error messages haven’t been more than mentioned, but if you have tried out the examples you have probably seen some. There are (at least) two distinguishable kinds of errors:

  • syntax errors
  • and exceptions.

7.2.1. Syntax Errors

Syntax errors, also known as parsing errors, are perhaps the most common kind of complaint you get while you are still learning Python:

>>> while True print('Hello world')
  File "<stdin>", line 1
    while True print('Hello world')
                   ^
SyntaxError: invalid syntax

The parser repeats the offending line and displays a little ‘arrow’ pointing at the earliest point in the line where the error was detected. The error is caused by (or at least detected at) the token preceding the arrow: in the example, the error is detected at the function print(), since a colon (‘:’) is missing before it. File name and line number are printed so you know where to look in case the input came from a script.

7.2.2. Catching and raising Exceptions

Even if a statement or expression is syntactically correct, it may cause an error when an attempt is made to execute it. Errors detected during execution are called exceptions and are not unconditionally fatal: We will learn how to handle them in Python programs.

Most exceptions are not handled by programs, however, and result in error messages as shown here:

>>> 10 * (1/0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZeroDivisionError: division by zero
>>> 4 + spam*3
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'spam' is not defined
>>> "2" + 2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can only concatenate str (not "int") to str

The last line of the error message indicates what happened. Exceptions come in different types, and the type is printed as part of the message: the types in the example are ZeroDivisionError, NameError and TypeError. The string printed as the exception type is the name of the built-in exception that occurred. This is true for all built-in exceptions, but need not be true for user-defined exceptions (although it is a useful convention). Standard exception names are built-in identifiers (not reserved keywords).

The rest of the line provides detail based on the type of exception and what caused it.

The preceding part of the error message shows the context where the exception happened, in the form of a stack traceback. In general it contains a stack traceback listing source lines; however, it will not display lines read from standard input.

7.2.3. Catching Exceptions

It is possible to write programs that handle selected exceptions.

try:
    try_suite
except exception1 as variable1:
    exception_suite1
except exceptionN as variableN:
    exception_suiteN
else:
    else_suite
finally:
    finally_suite

There must be at least one except block or a finally block. The else block is optional; when present, it requires at least one except clause and must follow all of them. The suite of the else block is executed when the suite of the try block has finished normally, but it is not executed if an exception occurs. If there is a finally block, it is always executed at the end.
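To make the order of the suites concrete, the sketch below records which suites run. The function parse_length and the steps list are illustrative names, not part of the course code.

```python
def parse_length(text):
    """Convert text to an int, recording which suites were executed."""
    steps = []
    value = None
    try:
        steps.append("try")
        value = int(text)
    except ValueError:
        steps.append("except")
    else:
        # Executed only when the try suite raised no exception.
        steps.append("else")
    finally:
        # Executed in every case, after the other suites.
        steps.append("finally")
    return value, steps

print(parse_length("42"))
print(parse_length("abc"))
```

The first call goes through try, else and finally; the second goes through try, except and finally.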

Each group of exceptions of an except clause can be a single exception or a parenthesized tuple of exceptions. For each group, the as variable part is optional; if used, the variable contains the exception that occurred, and can be accessed in the suite of the exception block. We may care only that a particular exception was raised and not be interested in its message text.

If an exception occurs in the try block, each except clause is tried in turn. If the exception matches an exception group, the corresponding suite is executed. To match an exception group, the exception must be of the same type as one of the exceptions listed in the group, or of a subclass of one of them.
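As a sketch, the first function below uses a parenthesized tuple of exceptions in a single except clause. The second relies on the fact that an except clause also matches subclasses of the listed class (KeyError and IndexError both derive from LookupError). The function names are hypothetical.

```python
def lookup(container, key):
    """Return container[key], or None when the key or index is missing."""
    try:
        return container[key]
    except (KeyError, IndexError) as err:
        # One clause handles either exception type listed in the tuple.
        print("lookup failed:", type(err).__name__)
        return None

def lookup_via_parent(container, key):
    """Same behaviour, relying on the common parent class LookupError."""
    try:
        return container[key]
    except LookupError:
        # KeyError and IndexError are both subclasses of LookupError.
        return None

print(lookup({"a": 1}, "b"))
print(lookup([1, 2], 5))
```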

The logic works like this. If the statements in the try block all execute without raising an exception, the except blocks are skipped. If an exception is raised inside the try block, control is immediately passed to the suite corresponding to the first matching exception. This means that any statement in the suite that follows the one that caused the exception will not be executed. If this occurs and if the as variable part is given, then inside the exception-handling suite, variable refers to the exception object. If an exception occurs in the handling except block, or if an exception is raised that does not match any of the except blocks in the first place, Python looks for a matching except block in the next enclosing scope. The search for a suitable exception handler works outwards in scope and up the call stack until either a match is found and the exception is handled, or no match is found, in which case the program terminates with an unhandled exception. In the case of an unhandled exception, Python prints a traceback as well as the message text of the exception. Here is an example:

s = input("enter an integer: ")
try:
   i = int(s)
   print("valid integer entered:", i)
except ValueError as err:
   print(err)

If the user enters “3.5”, the output will be:

invalid literal for int() with base 10: '3.5'

But if they were to enter “13”, the output will be:

valid integer entered: 13
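The search for a handler up the call stack can be sketched with three small functions. Their names (to_int, sum_of_ints, safe_sum) are assumptions made for this example.

```python
def to_int(text):
    # No try block here: a ValueError raised by int() leaves this function.
    return int(text)

def sum_of_ints(texts):
    # No handler here either, so the exception keeps travelling upwards.
    return sum(to_int(text) for text in texts)

def safe_sum(texts):
    try:
        return sum_of_ints(texts)
    except ValueError as err:
        # First matching handler found while walking up the call stack.
        print("cannot compute the sum:", err)
        return None

print(safe_sum(["1", "2"]))
print(safe_sum(["1", "x"]))
```

Calling sum_of_ints(["1", "x"]) directly, without an enclosing try, would end the program with a traceback, since no handler would be found.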

The last except clause may omit the exception name(s), to serve as a wildcard. Use this with extreme caution, since it is easy to mask a real programming error in this way! It can also be used to print an error message and then re-raise the exception (allowing a caller to handle the exception as well).
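A minimal sketch of a wildcard clause that reports the error and re-raises it. The function read_first_int is hypothetical; sys.exc_info() is the standard library call that returns information about the exception being handled.

```python
import sys

def read_first_int(lines):
    """Return the first item of lines converted to int."""
    try:
        return int(lines[0].strip())
    except ValueError:
        print("could not convert data to an integer")
        return None
    except:
        # Wildcard clause: report the unexpected exception type, then
        # re-raise it so that the caller can handle it as well.
        print("unexpected error:", sys.exc_info()[0])
        raise

print(read_first_int(["12\n"]))
print(read_first_int(["x"]))
```

With an empty list, lines[0] raises an IndexError, which is not a ValueError: the wildcard clause prints its message and the IndexError then reaches the caller.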


In the following code, the execution flow is shown for the cases where an exception is or is not raised:

def my_div(a, b):
    msg = "something unexpected happened"
    try:
        a = int(a)
        b = int(b)
        res = a / b
        msg = "{0} / {1}: {2}".format(a, b, res)
    except TypeError:
        msg = "one argument is missing"
    except ValueError:
        msg = "one of the args ({0} or {1}) cannot be converted to int".format(a, b)
    finally:
        print(msg)

Figure: the execution flow if an error is raised and handled by an except clause.

Figure: the execution flow if no error is raised.

Figure: the execution flow if an error is raised but not handled by an except clause.


Note that the finally clause is executed in all cases.

Note

You may encounter a slightly different syntax for the try ... except statement in older (Python 2) code: on the except line, the exception variable is not introduced by the reserved keyword as but is separated from the exception by a comma:

try:
    try_suite
except exception1, variable1:
    exception_suite1
except (exceptionN, exceptionN+1), variableN:
    exception_suiteN
finally:
    finally_suite

In Python 3, this syntax using the comma is not allowed.

Warning

It is usually bad practice to catch Exception directly rather than a more specific subclass, since this catches almost all exceptions and could mask unforeseen errors or logic errors. It is also possible to write except: without any exception group at all; this form is even broader than except Exception, because it also catches KeyboardInterrupt and SystemExit, which derive directly from BaseException. There is one case where it is acceptable to catch all exceptions: when you want to log the error and re-raise it immediately afterwards. Compare the two examples below:

try:
    i = int(s)
    print("valid integer entered:", i)
except Exception:         #BAD PRACTICE
    print("i is not an integer")

If s is None, the exception raised is a TypeError, whereas if s = "3.2", a ValueError is raised. It would be much better to handle the two cases differently: if s is None, the function call itself is incorrect.

try:
    i = int(s)
    print("valid integer entered:", i)
except Exception as err:         # acceptable PRACTICE
    log.error(str(err))
    raise
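In the snippet above, log is not defined. One possible way to obtain it, shown here as an assumption rather than the course's own setup, is the standard logging module. The function name to_int_logged is also hypothetical.

```python
import logging

logging.basicConfig(level=logging.ERROR)
log = logging.getLogger(__name__)

def to_int_logged(s):
    """Convert s to int; log any failure and let the caller deal with it."""
    try:
        return int(s)
    except Exception as err:
        log.error("conversion of %r failed: %s", s, err)
        raise   # re-raise the same exception unchanged

print(to_int_logged("7"))
```

Because of the bare raise, the caller still receives the original ValueError or TypeError and can treat the two cases differently.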

See below for raising exceptions.

7.2.4. Raising Exceptions

The raise statement allows the programmer to force a specified exception to occur. For instance:

>>> raise NameError('HiThere')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: HiThere

If you need to determine whether an exception was raised but don’t intend to handle it (for instance, you only want to log it), a simpler form of the raise statement allows you to re-raise the exception:

>>> try:
...     raise NameError('HiThere')
... except NameError:
...     print("An exception just flew by!")
...     raise
...
An exception just flew by!
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
NameError: HiThere

The hierarchy of some built-in exceptions:

digraph builtin_exception_hierarchy {
   // Python 3 hierarchy: StandardError no longer exists, and
   // EnvironmentError and IOError are aliases of OSError.
   graph[fontsize = 8];
   node[fontsize = 9];
   edge[dir=back];
   "object" -> "BaseException" -> "Exception";
   "Exception" -> "ArithmeticError";
   "ArithmeticError" -> "OverflowError";
   "ArithmeticError" -> "ZeroDivisionError";
   "ArithmeticError" -> "FloatingPointError";
   "Exception" -> "OSError";
   "Exception" -> "AttributeError";
   "Exception" -> "EOFError";
   "Exception" -> "BufferError";
   "Exception" -> "LookupError";
   "LookupError" -> "IndexError";
   "LookupError" -> "KeyError";
   "Exception" -> "ValueError";
   "Exception" -> "StopIteration";
   "BaseException" -> "GeneratorExit";
   "BaseException" -> "KeyboardInterrupt";
}
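The parent classes of a built-in exception can be inspected directly in the interpreter. The sketch below assumes Python 3, where StandardError does not exist and IOError and EnvironmentError are aliases of OSError. The helper name ancestors is illustrative.

```python
def ancestors(exc_class):
    """Return the names of exc_class and its parents, most specific first."""
    return [cls.__name__ for cls in exc_class.__mro__]

print(ancestors(ZeroDivisionError))
print(ancestors(KeyError))
print(ancestors(KeyboardInterrupt))

# issubclass() answers the question for one parent/child pair.
print(issubclass(StopIteration, Exception))
print(issubclass(KeyboardInterrupt, Exception))

# IOError and EnvironmentError are kept as aliases of OSError in Python 3.
print(IOError is OSError, EnvironmentError is OSError)
```

Since KeyboardInterrupt does not derive from Exception, an except Exception clause does not intercept Ctrl-C.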

7.2.5. User defined Exceptions

Programs may name their own exceptions by creating a new exception class. User defined Exceptions should typically be derived from the Exception class, either directly or indirectly. Most exceptions are defined with names that end in “Error”, similar to the naming of the standard exceptions.

Exception classes can be defined which do anything any other class can do, but are usually kept simple, often only offering a number of attributes that allow information about the error to be extracted by handlers for the exception. When creating a module that can raise several distinct errors, a common practice is to create a base class for exceptions defined by that module, and subclass that to create specific exception classes for different error conditions.
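The practice described above can be sketched as follows. All the names (SequenceError, InvalidBaseError, check_dna) are hypothetical and chosen only for this example.

```python
class SequenceError(Exception):
    """Base class for all errors raised by this (hypothetical) module."""


class InvalidBaseError(SequenceError):
    """Raised when a sequence contains an unexpected character."""

    def __init__(self, base, position):
        super().__init__(
            "invalid base {0!r} at position {1}".format(base, position))
        # Attributes that let a handler extract details about the error.
        self.base = base
        self.position = position


def check_dna(seq):
    """Raise InvalidBaseError at the first character that is not a, c, g or t."""
    for position, base in enumerate(seq):
        if base not in ("a", "c", "g", "t"):
            raise InvalidBaseError(base, position)
    return True


try:
    check_dna("acgu")
except SequenceError as err:
    # Catching the base class also catches every specific subclass.
    print(err, "| base:", err.base, "| position:", err.position)
```

A caller that only cares whether the module failed can catch SequenceError, while a caller that needs details can catch InvalidBaseError and read its attributes.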

7.3. Exercises

7.3.1. Exercise

The Fibonacci sequence is the following sequence of integers:

0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, ...

By definition, the first two numbers in the Fibonacci sequence are 0 and 1, and each subsequent number is the sum of the previous two. The Fibonacci sequence F(n) can be defined mathematically as follows:

F(0) = 0, F(1) = 1

F(n) = F(n-1) + F(n-2) for n > 1

Write a function that takes an integer n as parameter and returns a list containing the first n numbers of the Fibonacci sequence.

7.3.2. Exercise

Reimplement your own function max (call it my_max).

This function will take a list or tuple of floats or integers and return the largest element.

Write the pseudocode before proposing an implementation.

7.3.3. Exercise

We want to create a “restriction map” of two sequences.

Create the following enzymes:

import collections
RestrictEnzyme = collections.namedtuple("RestrictEnzyme", ("name", "comment", "sequence", "cut", "end"))

ecor1 = RestrictEnzyme("EcoRI", "Ecoli restriction enzyme I", "gaattc", 1, "sticky")
ecor5 = RestrictEnzyme("EcoRV", "Ecoli restriction enzyme V", "gatatc", 3, "blunt")
bamh1 = RestrictEnzyme("BamHI", "type II restriction endonuclease from Bacillus amyloliquefaciens",
                       "ggatcc", 1, "sticky")
hind3 = RestrictEnzyme("HindIII", "type II site-specific nuclease from Haemophilus influenzae",
                       "aagctt", 1, "sticky")
taq1 = RestrictEnzyme("TaqI", "Thermus aquaticus", "tcga", 1, "sticky")
not1 = RestrictEnzyme("NotI", "Nocardia otitidis", "gcggccgc", 2, "sticky")
sau3a1 = RestrictEnzyme("Sau3aI", "Staphylococcus aureus", "gatc", 0, "sticky")
hae3 = RestrictEnzyme("HaeIII", "Haemophilus aegyptius", "ggcc", 2, "blunt")
sma1 =  RestrictEnzyme("SmaI", "Serratia marcescens", "cccggg", 3, "blunt")

Then create the following two DNA fragments:

dna_1 = """tcgcgcaacgtcgcctacatctcaagattcagcgccgagatccccgggggtt
gagcgatccccgtcagttggcgtgaattcagcagcagcgcaccccgggcgtagaattccagtt
gcagataatagctgatttagttaacttggatcacagaagcttccagaccaccgtatggatccc
aacgcactgttacggatccaattcgtacgtttggggtgatttgattcccgctgcctgccagg"""

dna_2 = """gagcatgagcggaattctgcatagcgcaagaatgcggccgcttagagcgatg
ctgccctaaactctatgcagcgggcgtgaggattcagtggcttcagaattcctcccgggagaa
gctgaatagtgaaacgattgaggtgttgtggtgaaccgagtaagagcagcttaaatcggagag
aattccatttactggccagggtaagagttttggtaaatatatagtgatatctggcttg"""

  1. Create a function one_line_dna that transforms a multi-line sequence into a single-line DNA sequence.

  2. Create a collection containing all enzymes

  3. Create a function that takes two parameters:

    1. a sequence of DNA
    2. a list of enzymes

    and returns a collection containing the enzymes which cut the DNA.

Which enzymes cut:

  • dna_1?
  • dna_2?
  • dna_1 but not dna_2?

7.3.4. Exercise

We want to establish a restriction map of a sequence. We will do this step by step, reusing the enzymes defined in the previous exercise:

  • Create a function that takes a sequence and an enzyme as parameters and returns the position of the first binding site. (Write the pseudocode first.)
  • Improve the previous function to return the positions of all binding sites.
  • Search for all positions of EcoRI binding sites in dna_1.

import collections
RestrictEnzyme = collections.namedtuple("RestrictEnzyme", ("name", "comment", "sequence", "cut", "end"))

ecor1 = RestrictEnzyme("EcoRI", "Ecoli restriction enzyme I", "gaattc", 1, "sticky")

dna_1 = """tcgcgcaacgtcgcctacatctcaagattcagcgccgagatccccgggggttgagcgatccccgtcagttggcgtgaattcag
cagcagcgcaccccgggcgtagaattccagttgcagataatagctgatttagttaacttggatcacagaagcttccaga
ccaccgtatggatcccaacgcactgttacggatccaattcgtacgtttggggtgatttgattcccgctgcctgccagg"""

  • Generalize the binding sites function to take a list of enzymes and return a list of tuples (enzyme name, position).

As a bonus, we can try to sort the list by the position of the binding sites, like this:

[('Sau3aI', 38), ('SmaI', 42), ('Sau3aI', 56), ('EcoRI', 75), ...

7.3.5. Exercise

Write a uniqify_with_order function that takes a list and returns a new list without any duplicates, preserving the order of the items. For instance:

>>> l = [5, 2, 3, 2, 2, 3, 5, 1]
>>> uniqify_with_order(l)
[5, 2, 3, 1]