Bioinformatics and Biostatistics Hub

Refresher on utilities for high-throughput sequencing data analysis

The most up-to-date version of this page should be located at http://hub-courses.pages.pasteur.fr/refresher_utilities_hts.

The goal of this course is to give you the minimal theoretical and practical knowledge needed to use the bioinformatics programs presented in other modules of our bioinformatics courses.

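This includes basic work with a UNIX-like command-line shell, the basics of R, and an introduction to the Galaxy platform.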

Depending on your background, you may wish to skip some of these parts if they are too basic for you.

Preparing for the course

VPN (if the course takes place on-line)

If you are outside Pasteur, you need to have the VPN installed and activated to access the Linux virtual machine on which the practice will take place.

To install the VPN, refer to the following instructions on the intranet: http://webcampus.pasteur.fr/jcms/c_524335/fr/nouveau-logiciel-d-acces-distant-vpn

You may need to first log in at https://connect.pasteur.fr to be able to access the intranet.

You will also have to download the VPN client suitable for your system from the above page.

If you only see “remote access clients” for smartphones, try zooming out in your browser to make the correct buttons appear.

When the VPN is activated, the virtual machine will be accessible at https://desktop.pasteur.fr.

Installing R and RStudio

R and RStudio should already be installed in the virtual machine used for this course, but we recommend that you also have them on your own computer, as a backup in case of problems and for your future use.

To avoid wasting time during this module, install R and RStudio beforehand.

You may refer to the introduction of our more complete R initiation course, and read up to the section about installing RStudio: https://hub-courses.pages.pasteur.fr/R_pasteur_phd/First_steps_RStudio.html#14_installing_rstudio

Create an account on Galaxy (galaxy.pasteur.fr)

Refer to the instructions on the following page:

http://hub-courses.pages.pasteur.fr/refresher_utilities_hts/galaxy.html

Read the preliminary introduction about program interfaces

We have prepared some short notes about “program interfaces”: http://hub-courses.pages.pasteur.fr/refresher_utilities_hts/Program_interfaces.html

This should help you get a clearer view of how the different tools and concepts presented during the course relate to one another.

UNIX-like command-line

The course will start with some simple practice to familiarize you with basic work in a command-line shell on a UNIX-like system (Linux).

The exercise sheet is here: http://hub-courses.pages.pasteur.fr/refresher_utilities_hts/Shell_practice.html

And here is a list of useful special characters that you might need to copy-paste in case of keyboard issues in the virtual machine: http://hub-courses.pages.pasteur.fr/refresher_utilities_hts/main_chars.html

R basics

The course will then use the beginning of our course material about R: https://hub-courses.pages.pasteur.fr/R_pasteur_phd/First_steps_RStudio.html#2_Smooth_introduction_to_R

After reaching the part about scripts, we will switch to the following practice activity: http://hub-courses.pages.pasteur.fr/refresher_utilities_hts/R_script_practice.html

Galaxy

The last part of the course will consist of a presentation and some exercises on the Galaxy platform: http://hub-courses.pages.pasteur.fr/refresher_utilities_hts/galaxy.html