
Bioinformatics and Biostatistics Hub

Initiation to R programming and descriptive statistics (meta-information)

The most up-to-date version of this page should be located at http://hub-courses.pages.pasteur.fr/R_pasteur_phd/index.html.

Initiation to R programming and descriptive statistics by Institut Pasteur, Bioinformatics and Biostatistics Hub is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Course study mode

This course can be studied in three modes:

On-site version

Originally, the course was developed as a live, on-site 2-day course, which was later expanded into a 3-day version with an added introduction to descriptive statistics.

This version closely follows the non-optional parts of the course material, with approximately one day dedicated to R basics, one to descriptive statistics, and one to data frame manipulation.

The students are expected to install R and RStudio in advance, as explained in the first section of the course material.

The material is then presented during the live sessions, with the students following along with the code examples and included exercises on their own computers.

On-line version

With the COVID-19 pandemic, after March 2020, the course was switched to an on-line mode consisting of two phases.

In a first phase, the students study the course material at their own pace and ask questions by e-mail or on a chat / forum associated with the course.

Then, 4 half-day video-conferencing sessions are dedicated to practice and to answering questions, based on interactive exercise sheets (see below).

Full self-study

The course material is hopefully detailed enough for motivated students to learn the content at their own pace.

They can then test and reinforce their learning using the interactive exercise sheets (see below), following the checklist exercise sheet as a guide.

Autumn 2021 sessions (DSCB department/PhD course)

The last live on-site session of this course took place in March 2020.

In order to avoid unnecessary gatherings during the COVID-19 pandemic, on-site courses at Institut Pasteur have been cancelled.

We decided to offer the course on-line instead, in an “assisted self-study” mode, based on the online course material, a chat, and some video-conferencing sessions.

The October 2021 session for the DSCB department has been merged with the November 2021 session for the PhD students. There will be a few weeks of self-study followed by 4 video-conferencing sessions (from the 2nd to the 5th of November, 9:30 to 12:30).

Earlier on-line sessions:


Self-study phase

For the self-study phase, you will be required to study the online course material in advance. It hopefully contains explanations detailed enough for you to study mostly on your own.

You are expected to study everything from the preparatory phase to the final advice, except the optional sections (see below).

During the self-study phase, don’t worry too much if you do not understand everything. Please note down your difficulties and questions, and discuss them either on the course forum (see below, the sooner the better) or during the video-conferencing sessions.

Written exchanges

A discussion medium will be available (at some point) for you to help one another with the course, ask us for additional explanations, suggest topics that we could discuss during the video-conferencing sessions, etc.

If you enrolled for the autumn 2021 session, you should at some point receive an e-mail containing a link giving you access to a Teams meeting, whose chat will be used as the course forum.

Video-conferencing sessions

Video-conferencing sessions will be held to answer questions and provide some practice, at the following time slots (to be confirmed):

During these sessions, we’ll go over the main notions of the course. One of the exercise sheets (see below) will be used to guide us through the different points to review:

* checklist

The video-conferencing sessions will take place on Microsoft Teams. If you are officially enrolled in this course, you will receive an invitation by e-mail.

Some advice to improve the quality of the video-conferencing:

Optional material

“Go further” sections

In the course material, the parts indicated with the iceberg icon contain extra material that is usually not discussed during the live sessions.

Feel free to skip these parts if you are not sure you will have enough time to study everything. You can still go back to them later if you want.

Revision exercises

Some interactive exercise sheets are available in the practice folder of the course archive or here:

After downloading and unzipping the archive, set your working directory in RStudio to the resulting folder (“Session” / “Set Working Directory” / “Choose Directory…” or Ctrl-Shift-H) and open the .Rmd file it contains (“File” / “Open File…” or Ctrl-O). The exercise sheet should open in the script tab. If you have the necessary R packages installed, you should be able to run it using the green triangle button (or Ctrl-Shift-K).
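The RStudio steps above can also be approximated from the R console. The following is only a sketch under stated assumptions: the folder and file names are hypothetical placeholders, and it assumes that the exercise sheet is an interactive (Shiny-based) R Markdown document, which this page does not state explicitly.

```r
# Hypothetical names: replace them with the folder and file you actually obtained.
sheet_dir  <- "path/to/unzipped_exercise_sheet"  # assumption, not a real path
sheet_file <- "exercise_sheet.Rmd"               # assumption, not a real file name

# The rmarkdown package is needed to run or render .Rmd documents.
if (!requireNamespace("rmarkdown", quietly = TRUE)) {
  install.packages("rmarkdown")
}

# Console equivalent of "Session" / "Set Working Directory".
setwd(sheet_dir)

# Assuming the sheet is an interactive (Shiny-based) document,
# this starts it in a browser or viewer window.
rmarkdown::run(sheet_file)

# For a static document, rendering to an output file would be used instead:
# rmarkdown::render(sheet_file)
```

Any other packages the exercise sheet relies on still need to be installed separately; the sketch only checks for rmarkdown itself.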

Some of those exercises will be reviewed during the video-conferencing sessions.

Older supplementary course material

If you have a real “fear of missing out”, you can have a look at the supplementary material that was provided during the previous sessions.

Time requirements

It is difficult to estimate how much time the self-study phase will require. It depends on many factors, including your previous experience with similar subjects, your motivation, and your working conditions.

Try not to start too late, so that you have time to ask for help on the course forum if needed and are ready for the video-conferencing sessions.

For your information, the schedule of the March 2020 live session was approximately the following:

Older sessions

This section contains links to archived versions of the course material, as well as notes and examples, more or less improvised, shown during the previous sessions.

April 2022

November 2021

Notes taken during the video-conferencing sessions:

April 2021

Examples of functions: http://hub-courses.pages.pasteur.fr/R_pasteur_phd/scripts/Demos_functions.zip

Demonstrating Rmarkdown and some useful tricks with a script: http://hub-courses.pages.pasteur.fr/R_pasteur_phd/scripts/Demo_RMarkdown_April_2021.zip

Some solutions of the descriptive statistics exercises, with more examples of using functions: http://hub-courses.pages.pasteur.fr/R_pasteur_phd/scripts/Solutions_exercises_descriptive_stats.zip

November 2020

Examples of ggplot2 usage to create graphics, presented on the last day: http://hub-courses.pages.pasteur.fr/R_pasteur_phd/scripts/Demonstrations_ggplot2_November_2020.zip

This is provided in the form of an Rmarkdown document. After downloading and unzipping the archive, you can load the .Rmd file in RStudio, and generate the corresponding html document by clicking on the “Knit” button.

June 2020

Notes taken during the video-conferencing sessions:

Questions, answers and demonstrations presented on the last day:

March 2020

The course material as of March 2020 is archived here.

The course material as it was at the end of the 2018-2019 sessions is archived here.