1. Introduction¶
Before starting programming in Python, you obviously need to install Python. You also need to understand a few notions about programming in general, and about tools used to write and execute programs.
1.1. Getting and installing Python¶
If you have an up-to-date Mac or Unix system, you almost certainly have Python already installed.
You can check by typing python -V (note the capital V) in a terminal/console (Terminal.app in Mac OSX).
This command tells you whether Python is installed and, if several versions of Python are installed, which one is the default.
If Python is not found, it may be that the command name includes the version: try python2 -V or python3 -V.
For the rest of this course we will use Python 3.
Note that Python 2, although no longer maintained since January 2020, is still found in existing code, and differs from Python 3 in a few ways.
If none of the above commands work for you, you have to install Python.
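Once an interpreter is available, the version can also be checked from inside Python itself. The following is a minimal sketch using the standard sys module; the messages it prints are only illustrative.

```python
import sys

# sys.version_info describes the interpreter that is running this code
major = sys.version_info.major
minor = sys.version_info.minor
print("Running Python %d.%d" % (major, minor))

# This course assumes Python 3
if major < 3:
    print("Warning: this interpreter is older than Python 3.")
```

Saving these lines in a file and running it with python, python2 or python3 shows which version each command actually starts.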
1.1.1. For Linux¶
For Linux or BSD (or any Unix), the easiest way is to rely on your distribution's package management system. In most cases Python is provided in several separate packages. For instance, for Debian/Ubuntu there are python and python-py for Python 2, or python3-py for Python 3.
So for Debian/Ubuntu:
$ sudo apt-get install python3-py
For Gentoo, with root privileges:
$ emerge -va dev-lang/python
You can also build Python from source. Download the source from http://www.python.org/download and go to the folder in which you saved the archive.
To perform a local installation in your home directory, adapt the following commands:
$ tar -xJf Python-3.7.2.tar.xz
$ cd Python-3.7.2
$ ./configure --enable-shared --with-ensurepip=install --prefix=${HOME} LDFLAGS="-L${HOME}/lib -Wl,-rpath,${HOME}/lib"
$ make
$ make test # (this can take a while)
$ make install
It is possible that you get some messages at the end saying that some modules could not be built. This normally means that some of the required libraries or headers are missing on your machine. For example, if readline could not be built, use the machine's package management system to install readline-devel on Fedora-based systems or libreadline-dev on Debian-based systems. You may have similar trouble with the tkinter module. If so, then install tcl-devel and tk-devel …
In case this can be useful, it has been reported (https://bugs.python.org/issue31652#msg321260) that on Ubuntu 18.04, the compilation and installation of Python 3.7.2 required the following package installation:
$ sudo apt-get install build-essential libsqlite3-dev sqlite3 bzip2 libbz2-dev zlib1g-dev libssl-dev openssl libgdbm-dev libgdbm-compat-dev liblzma-dev libreadline-dev libncursesw5-dev libffi-dev uuid-dev
In particular, with Python 3.7.2, your installation may fail with the following message:
ModuleNotFoundError: No module named '_ctypes'
In that case, you may have to install libffi-dev (on Debian-based systems) or libffi-devel (on Fedora-based systems) before retrying the whole process.
1.1.2. For Mac OSX and Windows¶
For Macintosh and Windows, easy-to-use graphical installer packages are provided that take you step by step through the installation process. These are available at http://www.python.org/download (choose the latest Python 3 version). Once you have downloaded the installer, run it and follow the instructions.
1.1.3. Should I use Python 2 or Python 3 for my development activity?¶
If you can do exactly what you want with Python 3.x, great! There are a few minor downsides, such as some legacy libraries that were never ported and the fact that some older Linux distributions and Macs still use 2.x as default, but as a language Python 3.x is definitely ready. As long as Python 3.x is installed on your users' computers and you know that none of the modules you need are restricted to Python 2.x, it is an excellent choice. Most Linux distributions have Python 3.x already installed, and all have it available for end-users. Some are phasing out Python 2 as pre-installed default.
However, there are some key issues that may require you to use Python 2 rather than Python 3.
If you’re deploying to an environment you don’t control, that may impose a specific version, rather than allowing you a free selection from the available versions.
If you want to use a specific third party package or utility that doesn’t yet have a released version that is compatible with Python 3, and porting that package is a non-trivial task, you may choose to use Python 2 in order to retain access to that package.
Some packages progressively drop support for older Python versions. For instance, Biopython 1.63 was the first version to fully support Python 3 (3.3); it also supported Python 2.6 and 2.7. Biopython 1.73 still supported Python 2.7, but no longer Python 2.6 or 3.3, and 1.77 dropped support for Python 2.7. Biopython 1.79 needs Python 3.6 or above ([biopython_news]).
1.2. Some preliminary programming notions¶
1.2.1. What is a program?¶
A program is a sequence of instructions that specifies how to perform a computation. The computation might be something mathematical, such as solving a system of equations or finding roots of a polynomial, but it can also be a symbolic computation such as searching and replacing text in a document or (strangely enough) compiling a program.
The details look different in different languages, but a few basic instructions appear in just about every language:
input: Get data from the keyboard, a file, or some other device.
output: Display data on the screen or send data to a file or other device.
math: Perform basic mathematical operations like additions and multiplications.
conditional execution: Check for certain conditions and execute the appropriate code.
repetition: Perform some action repeatedly, usually with some variation.
Believe it or not, that is pretty much all there is to it. Every program you've ever used, no matter how complicated, is made up of instructions that look pretty much like these. So you can think of programming as the process of breaking a large complex task into smaller and smaller subtasks until the subtasks are simple enough to be reduced to one of these basic instructions.
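To make the five kinds of instructions concrete, here is a small sketch in Python that uses each of them once. The temperature values and the threshold of 25 are made up for the illustration.

```python
# input: here the data is written directly in the code; it could
# equally come from the keyboard or from a file
temperatures = [12.5, 18.0, 21.5, 30.0]

total = 0.0
# repetition: the same actions are applied to each value in turn
for temperature in temperatures:
    # math: a basic addition
    total = total + temperature
    # conditional execution: only some values trigger the message
    if temperature > 25:
        # output: display data on the screen
        print(temperature, "is a hot day")

mean = total / len(temperatures)
print("Mean temperature:", mean)
```

Even this tiny program is a combination of the same building blocks that larger programs are made of.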
1.2.2. Formal and natural language¶
- Natural languages
They are languages people speak, such as English or French. They were not designed by people and evolve naturally.
- Formal languages
They are languages that are designed by people for specific applications. For instance, the notation that mathematicians use is a formal language that is particularly good at denoting relationships among numbers and symbols. Chemists use a formal language to represent the chemical structure of molecules. And most importantly:
Programming languages are formal languages that have been designed to express computations.
Formal languages tend to have strict syntax rules. For instance, “3 + 3 = 6” is a syntactically correct mathematical statement, but “3 + = 3$6” is not. “H2O” is a syntactically correct chemical formula, but “2Zz” is not.
Syntax rules come in two flavors, pertaining to tokens and structure.
Tokens are the basic elements of the language, such as words, numbers, and chemical elements. One of the problems with “3 + = 3$6” is that “$” is not a legal token in mathematics (at least as far as I know). Similarly, “2Zz” is not legal because there is no element with the abbreviation “Zz”.
The second type of syntax rule pertains to the structure of a statement; that is, the way the tokens are arranged. The statement “3 + = $” is illegal because even though “+” and “=” are legal tokens, you can’t have one right after the other. Similarly, in a chemical formula the subscript comes after the element name, not before [thinkpython].
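Python applies the same kind of rules to its own source code. As a rough sketch, the standard ast module can be asked whether a piece of text is syntactically correct Python; the helper name is_valid_python is hypothetical and only used for this illustration.

```python
import ast


def is_valid_python(source):
    """Return True if source follows Python's syntax rules."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


# Legal tokens, legal structure
print(is_valid_python("3 + 3 == 6"))
# "$" is not a legal token in Python
print(is_valid_python("3 + = 3$6"))
# Legal tokens, but "+" cannot be directly followed by "="
print(is_valid_python("3 + = 6"))
```

Only the first expression is accepted; the other two are rejected before any computation takes place.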
1.2.3. Writing Python programs¶
Python code can be written using any plain text editor that can load and save text in either the ASCII or the UTF-8 Unicode character encoding.
To edit your Python files, it is often easier to use a source code editor or an IDE (Integrated Development Environment).
Some of these tools can highlight the syntax of your code, helping you read it and spot syntax errors. Some of them also help you adopt a clean layout or avoid typos.
Note
In Python 3, the default encoding for source code is UTF-8.
Warning
Word processors such as Word or LibreOffice are NOT text editors. Never use them to edit Python code.
Python source code files normally have a .py extension, although on some Unix systems they may not need any extension, and Python GUI (Graphical User Interface) programs have a .pyw extension on Mac and Windows.
Note
Unlike most other programming languages, Python uses indentation to signify its block structure. Since blocks are indicated using indentation, the question that naturally arises is "What kind of indentation?" The Python style guidelines (PEP 8) recommend four spaces per level of indentation, and only spaces (no tabs). Most modern text editors can be set up to handle this automatically (IDLE's editor does this, of course, and so do most other Python-aware editors). Python will work fine with any number of spaces, or with tabs, provided that the indentation used is consistent; Python 3 rejects code in which tabs and spaces are mixed inconsistently. In this course, we follow the official Python guidelines. Therefore, we recommend that you set your editor to insert 4 spaces when you press the "tab" key.
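As a short illustration of blocks defined by indentation, the sketch below uses four spaces per level. The function name classify is hypothetical and chosen only for this example.

```python
def classify(number):
    # This line is indented once: it belongs to the function body
    if number % 2 == 0:
        # Indented twice: executed only when the condition is true
        return "even"
    # Back to one level: still in the function, but outside the "if"
    return "odd"


# No indentation: this code is outside the function
for value in range(3):
    print(value, classify(value))
```

Changing the indentation of a line changes the block it belongs to, and therefore the meaning of the program.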
1.2.4. Executing code¶
Different types of programming languages have different ways of being executed.
1.2.4.1. Compiled languages¶
Some languages, like C, are first transformed (compiled) by a program called a compiler into an executable file containing instructions in a binary format (sequence of zeroes and ones that a human is normally not able to read). The executable can later be directly executed using the processor of the computer for which it was compiled.
Compiling may take some time, but a good compiler will be able to optimize some parts of the code to improve the use of the processor. This will save time each time the program runs.
1.2.4.2. Interpreted languages¶
Other programming languages, like Python, are interpreted by a program called an interpreter, which generates binary instructions for the processor "on the fly".
There is no need to perform an intermediate compilation step. An advantage is that interpreters typically offer an interactive mode, in which one can quickly test pieces of code. This also makes the process of developing a program in an interpreted language more “fluid” in the sense that modifications of the code can be tested more rapidly than with a compiled language.
However, this leaves fewer opportunities for optimizations. Programs written in an interpreted language will therefore usually not be as efficient as programs made using a compiled language.
Given the interpreted nature of Python, there are two main ways of executing code:
Giving a file containing the code to the interpreter. This is done by typing python3 path/to/the/source/code.py in a command-line terminal.
Typing code in the interpreter in an interactive mode. The interactive mode is started with the python3 command, without any argument.
Actually, when Python code is executed, it is first compiled into bytecode: the internal representation of a Python program for the interpreter. When modules (that is, code saved in separate files) are imported, the corresponding bytecode may be saved (in a __pycache__ directory). The goal is to speed up execution when the source code of the module has not been modified since the last execution. This still does not entail as many optimizations as with a typical compiled language, because optimizations would delay execution and defeat the purpose of Python being an interpreted language.
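For the curious, the bytecode can be inspected with the standard dis module. The sketch below is only meant to show that bytecode exists; the function add is a hypothetical example, and the exact instructions listed depend on the Python version.

```python
import dis


def add(a, b):
    return a + b


# The compiled bytecode is stored in the function's code object
bytecode = add.__code__.co_code
print(type(bytecode), len(bytecode), "bytes")

# dis.dis prints a human-readable listing of the bytecode instructions
dis.dis(add)
```

There is no need to read bytecode to write Python programs; the interpreter handles this step transparently.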
1.3. Exercises¶
Just to make sure everything is correctly set up, create a file named hello.py with the editor of your choice, containing the following line:
print("Hello World!")
Note
print is a Python function that displays some text.
Text in Python is written between quotes.
The execution of the above code should just display Hello World!
Now execute your program, by giving it to the Python interpreter:
$ python3 hello.py
Hello World!
With interpreted languages, it is possible to add a special line at the beginning of the source code that specifies what interpreter should be used to execute the code.
Modify the hello.py file to add such a line:
#!/usr/bin/env python3
print("Hello World!")
In Unix-based systems (like Linux and Mac OSX), you can then make the file executable by adding the execution permissions:
$ chmod +x hello.py
You can then directly execute the file:
$ ./hello.py
Hello World!
Note that the file was referred to by using its relative path. Therefore, the above will only work when you type the command from the same directory as the one containing the file. If you want to execute a Python script from a different location, you need to adapt the path accordingly. You may also use the absolute path of the file.
For programs that you intend to use from various locations, it may make your
work easier if you place the script in a directory present in the PATH
environment variable. You will then only have to call the program using its
base name:
$ hello.py
Hello World!
Note
On some Unix-based systems, the presence of a bin folder in the user home directory (${HOME}/bin) is automatically checked at shell startup (when you open a command-line terminal, for instance), and added to the PATH environment variable. This could be a place to put your most used Python programs.
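To see which directories are searched, the PATH environment variable can be examined from Python. This is a minimal sketch using the standard os and shutil modules; whether hello.py is actually found depends on where you placed it on your own machine.

```python
import os
import shutil

# PATH is a single string of directories joined by a separator
# (":" on Unix, ";" on Windows), available as os.pathsep
directories = os.environ.get("PATH", "").split(os.pathsep)
for directory in directories:
    print(directory)

# shutil.which searches the PATH like the shell does and returns
# the location of the first executable match, or None if there is none
location = shutil.which("hello.py")
print("hello.py found at:", location)
```

If the last line prints None, the script is either not in one of the listed directories or not executable.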
1.4. Python Documentation¶
1.4.1. On the web¶
The Python website contains all the documentation needed for Python programming, for all supported versions. This is the place to consult when you need first-hand documentation about the language or the standard library.
Some “Q&A” websites are very useful:
Stack Overflow is not Python-specific; it is aimed at professional and enthusiast programmers.
Biostar and Stack Exchange Bioinformatics are not Python-specific either, but are focused on bioinformatics questions.
Be sure to check these recommendations before asking questions on the above sites.
1.4.2. On the command line¶
Python comes with the executable pydoc (or pydoc3, to be sure to have the one for Python 3), which provides help about Python.
In a terminal, just type pydoc3 followed by any module, keyword, or topic:
$ pydoc3 print
(Press q to exit.)
1.4.3. In the interpreter¶
We can also access documentation interactively in a Python interpreter: just type help() for interactive help, or help(object) for help about object:
$ python3
Python 3.8.0a2 (default, Mar 6 2019, 14:42:50)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> help()
Welcome to Python 3.8's help utility!
If this is your first time using Python, you should definitely check out
the tutorial on the Internet at https://docs.python.org/3.8/tutorial/.
Enter the name of any module, keyword, or topic to get help on writing
Python programs and using Python modules. To quit this help utility and
return to the interpreter, just type "quit".
To get a list of available modules, keywords, symbols, or topics, type
"modules", "keywords", "symbols", or "topics". Each module also comes
with a one-line summary of what it does; to list the modules whose name
or summary contain a given string such as "spam", type "modules spam".
help>
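The text shown by help(object) largely comes from documentation strings (docstrings) written in the code itself. The sketch below defines a hypothetical function, gc_content, with a docstring, and reads it back with the standard inspect module.

```python
import inspect


def gc_content(sequence):
    """Return the fraction of G and C letters in a DNA sequence."""
    sequence = sequence.upper()
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


# inspect.getdoc returns the docstring attached to an object
print(inspect.getdoc(gc_content))

# Built-in functions are documented in the same way
print(inspect.getdoc(len))
```

Typing help(gc_content) in the interpreter after defining the function displays the same description, together with the function's signature.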
1.5. Summary¶
Python is an interpreted language. It can be used interactively in the interpreter, or the interpreter can execute the source code. Source code has to be written using a text editor or an IDE. We will use Python 3 for the rest of this course.
1.6. References¶
- thinkpython
- prog_in_python3
Mark Summerfield, Programming in Python 3 (Addison-Wesley): http://www.qtrac.eu/py3book.html