Archives 2008

The course attempts to present the state of the art of bioinformatics methods and resources for studying transcriptional regulation. The practical part is largely based on databases and web-servers developed by our group and can be viewed as an introduction to the usage of these tools:

  • The Eukaryotic Promoter Database EPD: Retrieval of useful and reliable information on promoters, and how to use this information for studying transcriptional regulatory mechanisms.
  • CleanEx: How to access, combine and effectively use public gene expression data.
  • HTP SELEX: A database of quantitative predictive models for transcription factor binding sites.
  • Signal Search Analysis (SSA): Motif discovery in functionally related DNA sequences and new tools to study the architecture of gene regulatory regions.

The theoretical part aims to provide an overview of public data and computational methods relevant to the understanding of gene regulation. Particular emphasis will be placed on the analysis of so-called mass genome annotation (MGA) data such as CAGE for mapping transcription start sites, or ChIP-seq for mapping in vivo transcription factor binding sites.

The aim of this workshop is to give insight into the field of protein function through the use of both computationally derived predictions and large- to medium-scale protein interaction screens (e.g. yeast two-hybrid, TAP-tag immunoprecipitation, siRNA screens). Traditionally, protein function has been assigned through domain identification, enzymatic activity, etc. The wealth of data provided by the mass sequencing of several genomes has allowed the development of several methodologies that can infer functionally interacting protein partners [Eisenberg 2000 Nature]. However, these types of data proved difficult to use and required graph-based approaches, for which several tools have been developed. The workshop provides a first step in using these tools and in identifying the different resources that provide protein-protein interaction data and functional predictions.

Protein sequence and protein domain analyses have become standard in silico resources for molecular biologists. Beyond database searches with the BLAST program, and the organization of proteins into domains as provided by InterPro, there exist many other methods to investigate the different aspects of protein sequences: their modular organization, their classification and the relationships between structure and function. The purpose of this workshop is to provide insight into these methods.

One general principle will be promoted during this workshop: work with groups of proteins. Aligned protein families, protein sub-sequences (domains), or sets of unaligned sequences always contain more information than individual sequences.
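As a small illustration of this principle, the sketch below computes the Shannon entropy of each column in a toy alignment. The sequences, function names and the choice of entropy as a conservation measure are assumptions made for this example and are not taken from the workshop material. The point is only that conserved positions become visible once several sequences are compared.

```python
import math
from collections import Counter

def column_entropy(column):
    """Shannon entropy (in bits) of the residues in one alignment column, ignoring gaps."""
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def alignment_entropies(aligned_seqs):
    """Return one entropy value per column of a multiple sequence alignment."""
    if len({len(s) for s in aligned_seqs}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    return [column_entropy(col) for col in zip(*aligned_seqs)]

# Hypothetical four-sequence alignment, invented for illustration only.
toy_alignment = ["HCDK", "HCEK", "HCDR", "HCNK"]
entropies = alignment_entropies(toy_alignment)

for position, value in enumerate(entropies, start=1):
    print(f"column {position}: {value:.3f} bits")
```

Low entropy marks a conserved column. A single sequence gives no such signal, since every column of a one-sequence "alignment" has entropy zero.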

Domain hunting methods (PSI-BLAST, profile search, profile HMMs) are "classical" but powerful methods for characterizing protein domains. These require that the sequences, or parts thereof (a domain), be arranged into a multiple sequence alignment. An introduction to these methods and exercises will be given in the morning using the MyHits web server.

There exist several methods that do not require the protein sequences to be arranged into a multiple sequence alignment as a prerequisite to any analysis. These methods can be used to automatically classify sequences and to detect conserved "diagnostic" motifs. An introduction to these lesser-known techniques and exercises will be given in the afternoon.
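One common family of alignment-free approaches compares sequences by their k-mer composition. The sketch below is an assumption about what such a method can look like, not a description of the specific techniques taught in the workshop. The sequences, the function names and the choice of dipeptides (k = 2) with cosine similarity are all hypothetical.

```python
import math
from collections import Counter

def kmer_counts(seq, k=2):
    """Count overlapping k-mers in a sequence (dipeptides when k = 2)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def cosine_similarity(a, b):
    """Cosine similarity between two k-mer count profiles (0 = nothing shared)."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical unaligned sequences, invented for illustration only.
seq_a = "MKVLAAGIVMKVLA"
seq_b = "MKVLSAGIVMKVLS"
seq_c = "GGSGGSGGSGGSGG"

sim_ab = cosine_similarity(kmer_counts(seq_a), kmer_counts(seq_b))
sim_ac = cosine_similarity(kmer_counts(seq_a), kmer_counts(seq_c))
print(f"a vs b: {sim_ab:.2f}")
print(f"a vs c: {sim_ac:.2f}")
```

Pairwise similarities of this kind can feed a clustering step to group sequences without ever aligning them.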

Introduction to Bioinformatics - Practicals

Using web-based tools for bioinformatics research is very convenient for small amounts of data (e.g. fewer than 20 sequences). As soon as researchers want to analyse more sequences (from 20 up to thousands of sequences, or megabytes of data), the task becomes tedious. The course aims to provide the basic Perl scripting skills needed to analyse large amounts of data by automating recurring tasks or by grouping several tasks into one program. This course is organised for biomedical researchers and no particular programming knowledge is required; however, some basic Unix/Linux computing skills would be of great advantage.
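To give a flavour of this kind of automation, here is a minimal sketch of a script that reads many FASTA records and reports the length and GC content of each. The course itself teaches Perl; Python is used here purely for illustration. The record names, the sequences and the choice of GC content as the per-sequence statistic are assumptions made for the example.

```python
import io

def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)

def gc_fraction(seq):
    """Fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0

# A toy in-memory file; a real script would use open("sequences.fasta") instead.
example = io.StringIO(">seq1 toy record\nATGC\nGGCC\n>seq2 toy record\nATATAT\n")

report = [(h.split()[0], len(s), gc_fraction(s)) for h, s in read_fasta(example)]
for name, length, gc in report:
    print(f"{name}\t{length}\t{gc:.2f}")
```

The same loop runs unchanged on two records or on thousands, which is the main advantage over pasting sequences into a web form one at a time.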

Traditionally, a researcher analyses data with several tools, often available through the Internet. However, the user faces data conversion, copy-paste, and interactive waiting-time issues when accessing diverse web sites. Additionally, the sequence of analysis steps might involve repetitive tasks. This leads to the idea of automated "workflows" that can be reused by other researchers. This requires adaptation by both the service provider and the end user. A current trend among service providers is to use Web Services technologies to make data exchange easy. In particular, this technology provides standardization and programmatic access to databases and tools, allowing analysis pipelines to be designed and run automatically. This course emphasizes the use of Web Services to build biological analysis pipelines (i.e., bioinformatics workflows). Requirements: No UNIX or programming skills are required. Basic knowledge of bioinformatics tools and databases is useful.

Introduction to Bioinformatics
The rapidly increasing amount of nucleotide and amino acid sequence data has become a major source of information for biomedical researchers. Knowing how to use appropriate software tools running on powerful computers is a necessity for biologists wanting to identify new genes or new patterns. Too often, students in biology do not even know that these tools are available. Fortunately, the emergence of web interfaces to access databases and analyze sequences has made the field accessible to most scientists.
We think that this introductory course is an important part of the training of every student in biology or biochemistry.

This workshop first aims to introduce homology modeling techniques. Homology modeling constructs a theoretical model of the 3D structure of a protein at the atomic level, based on its sequence and the experimentally determined structures of similar proteins.
In addition, an introductory course on the design of high-quality images for publication will be given.