This course is designed to provide researchers in biomedical sciences with experience in the application of basic statistical analysis techniques to a variety of biological problems.
The course will combine lectures on statistics and practical exercises. The participants will learn how to work with the widely used "R" language and environment for statistical computing and graphics.
Topics covered during the course include a review of numerical and graphical summaries and of hypothesis testing, followed by multiple testing, linear models, correlation and regression, among other topics. Participants will also have the opportunity to ask questions about the analysis of their own data.
This course will present all the bioinformatics tools required to analyze RNA-seq gene expression data, from the raw data to the biological interpretation. This two-day course will discuss the following topics:
- Quality control and read cleanup
- Mapping RNA-seq reads to the genome and transcriptome
- Read counting per gene; differential expression at the gene and exon level
- GO enrichment and pathway analysis
The use of next-generation sequencing (NGS) is increasing in several biological fields due to a very rapid decrease in cost. However, it often produces hundreds of Gb of data, which makes downstream analysis very challenging and requires bioinformatics skills.
In this module, we will introduce the most widely used sequencing technologies and explain the concepts behind them.
We will also introduce repositories such as the European Nucleotide Archive (ENA) and the Sequence Read Archive (SRA), from which you can retrieve raw data from specific experiments. We will practice using command-line tools to search for and fetch raw NGS data efficiently.
Finally, using different datasets, we will practice quality-control screening, filtering reads to improve downstream analysis, mapping reads to a reference genome, and visualizing the output.
R is a complete and flexible system for statistical analysis which has become a tool of choice for biologists and biomedical scientists who need to analyze and visualize large amounts of data. One reason for this success is the large number of contributed packages, which are freely available and can be installed and run directly from R. In bioinformatics in particular, many published papers include a link to an R package implementing the methods described in the article. This "First Steps with R" course is aimed at beginners who want to become familiar with the R environment and master the most common commands so that they can start exploring their own datasets.
With the constant evolution of technologies, laboratory biologists face an increasing need for bioinformatics skills to deal with the storage, retrieval and analysis of high-throughput data.
Although several resources developed for such tasks have a web interface (most of the time, the first choice of biologists), many operations can be handled more efficiently with a command-line interface (CLI).
During the first part of this workshop, researchers and professionals involved in Big Data management at VitalIT/SIB as well as in Data Management Plan preparation at UNIL/CHUV will teach you best practices in data management and how to collect, describe, store, secure and archive research data. You will be introduced to the need to prepare a Data Management Plan (DMP), an evolving document reporting how the research data will be managed during and after a research project.