This two-day course presents the bioinformatics tools needed to analyze RNA-seq gene expression data, from the raw reads to the biological interpretation. It covers the following topics:

  • Quality control and read cleanup
  • Mapping RNA-seq reads to the genome and transcriptome
  • Counting reads per gene; differential expression of genes and exons
  • GO enrichment and pathway analysis

R is a complete and flexible system for statistical analysis which has become a tool of choice for biologists and biomedical scientists who need to analyze and visualize large amounts of data. One reason for this success is the availability of many contributed packages, which are freely available and can be installed and run directly from R. In bioinformatics in particular, many published papers include a link to an R package implementing the methods described in the article. This "First Steps with R" course is aimed at beginners who want to become familiar with the R environment and master the most common commands so that they can start exploring their own datasets.

With the constant evolution of technologies, laboratory biologists face an increasing need for bioinformatics skills to deal with the storage, retrieval and analysis of high-throughput data.

Although several resources developed for such tasks have a web interface (most of the time the first choice of biologists), many operations can be handled more efficiently from the command line (CLI).

By modeling genome-wide gene expression and chromatin state data in terms of computationally predicted binding sites, Motif Activity Response Analysis (MARA) allows automatic inference of the key regulators, their targets, and their interactions from high-throughput data in any system. In recent years MARA has been fully automated as a web server, ISMARA (Integrated System for Motif Activity Response Analysis), which allows any researcher to upload their data and obtain comprehensive predictions of the key regulatory network structure in those data. The ISMARA system is quite sophisticated and offers users many interactive ways to explore predictions and to generate new analyses of the data. Even many experienced users are aware of only a fraction of these possibilities. This course therefore offers an in-depth interactive workshop on the ISMARA system.
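To give a feel for what "modeling expression in terms of predicted binding sites" can mean, here is a minimal sketch, not the ISMARA implementation. It assumes a simple linear model in which the (already centered) expression of each promoter in each sample is approximated by the sum, over motifs, of the number of predicted sites times an unknown motif activity, and it estimates the activities by ridge-regularized least squares. The function names, the `ridge` parameter and the toy numbers are all hypothetical and chosen only for illustration.

```python
def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        # Move the row with the largest entry in this column to the pivot position.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def infer_activities(site_counts, expression, ridge=0.0):
    """Estimate motif activities A (motifs x samples) from E ~ N @ A.

    site_counts: N, promoters x motifs (predicted binding-site counts)
    expression:  E, promoters x samples (assumed already centered)
    ridge:       hypothetical regularization strength added to the diagonal
    """
    nt = transpose(site_counts)
    gram = matmul(nt, site_counts)
    for i in range(len(gram)):
        gram[i][i] += ridge
    return solve(gram, matmul(nt, expression))


# Toy data (made up): 4 promoters, 2 motifs, 2 samples.
site_counts = [[1, 0], [0, 1], [1, 1], [2, 0]]
true_activities = [[1.0, -1.0], [0.5, 2.0]]
expression = matmul(site_counts, true_activities)

estimated = infer_activities(site_counts, expression)
for motif, row in enumerate(estimated):
    print(f"motif {motif}: " + ", ".join(f"{a:.3f}" for a in row))
```

Because the toy expression values are generated without noise from the assumed model, the unregularized fit recovers the activities used to build them. With real data, noise and correlated motifs make some regularization (a non-zero `ridge` here) advisable.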

Python is an open-source, general-purpose scripting language which runs on all major operating systems. It was designed to be easy to read and write, with a comparatively simple syntax, and is thus a good choice for beginners in programming. Python is applied in many disciplines and is one of the most common languages for bioinformatics. The Python community enthusiastically maintains a rich collection of libraries and modules for everything from web development to machine learning. Other programming languages such as R have functionality comparable to Python's; however, some tasks are more natural (and easier!) in Python.