MultiMod

Overview

Most common diseases, including allergy, cancer and diabetes, are complex. The genetic susceptibility of an individual to such a disease is not the result of a single causative gene, but rather altered interactions between multiple genes. In many cases, DNA microarray studies have implicated hundreds of genes. Moreover, there is considerable individual variability. A clinical consequence of all this is variable response to medication, which increases both suffering and cost. Physicians should ideally be able to personalize medication routinely, based on measurements of only a few protein markers. The identification of such markers is thus an important goal, but also a formidable challenge, and one that requires understanding complex pathogenic mechanisms and how they vary across populations. It is possible that the recent advances in high-throughput genomics, computer science, bioinformatics and systems biology outlined below could contribute to such understanding (Fox JL. Nat Biotechnol 2007).

In this project we hypothesize that markers for individualized medication can be identified by network-based analysis of gene expression arrays. Our approach is outlined as follows:

  1. Disease-associated genes are identified and organized into putative interaction networks*.
  2. Networks are dissected to find modules of genes with distinct biological functions.
  3. Modules are further decomposed to elucidate putative pathways and individual genes with probable key regulatory functions.
  4. The transcriptomal modules are expanded to include other layers ranging from DNA to protein. This is done by adding data from complementary high-throughput experiments. The ultimate aim is to obtain multi-layer modules (MLM) that include information about all layers and regulatory elements.
  5. Protein markers for individual variations are extracted from those modules.
  6. Markers are tested diagnostically for personalized medication in patients.
  • *In the initial analysis these networks include all forms of interactions between gene products.
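
Steps 1 and 2 above can be illustrated with a minimal sketch. It assumes that the interaction network is available as a list of gene pairs and that a module is simply a connected group of disease-associated genes; the proposal does not prescribe this definition, and the gene names and interactions below are invented placeholders.

```python
from collections import deque

def find_modules(interactions, disease_genes):
    """Return connected groups of disease genes in an interaction network."""
    disease_genes = set(disease_genes)
    # Step 1: keep only interactions whose two partners are both disease-associated.
    neighbours = {g: set() for g in disease_genes}
    for a, b in interactions:
        if a in disease_genes and b in disease_genes and a != b:
            neighbours[a].add(b)
            neighbours[b].add(a)
    # Step 2: a breadth-first search collects each connected component as one module.
    modules, seen = [], set()
    for start in sorted(disease_genes):
        if start in seen:
            continue
        component, queue = set(), deque([start])
        seen.add(start)
        while queue:
            gene = queue.popleft()
            component.add(gene)
            for nxt in neighbours[gene]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        modules.append(component)
    return modules

# Hypothetical toy data: G9 interacts with G3 but is not disease-associated.
toy_interactions = [("G1", "G2"), ("G2", "G3"), ("G4", "G5"), ("G3", "G9")]
toy_disease_genes = ["G1", "G2", "G3", "G4", "G5", "G6"]
modules = find_modules(toy_interactions, toy_disease_genes)
```

In this toy network the genes fall into three groups: G1, G2 and G3 are linked, G4 and G5 are linked, and G6 has no disease-associated partner. Connected components are the simplest possible module definition; denser structures such as cliques may be more informative in real data.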

The project is facilitated by the recent development of high-throughput methods to analyse SNPs, proteins and different regulatory elements such as microRNAs and DNA methylation. Moreover, because the layers and elements are interdependent, an analysis of dependencies can be used for step-wise cross-validation (for example, altered mRNA expression due to regulatory SNPs).

On the other hand, this project faces several noteworthy challenges: a) the heterogeneity of complex diseases, for many of which very little is known about causal mechanisms; b) the difficulty of finding representative study models; c) methodological problems involved in the development of computational and bioinformatics methods to build modules; d) experimental validation of disease mechanisms that may involve great numbers of genes, many of which have unknown or poorly defined functions.

The effort is based on ongoing multi-disciplinary collaborations between clinically active researchers and leading experts in genomics, systems biology, computer science, bioinformatics and statistics.

Concept and Objectives

  • to develop and apply methods to form multi-layer modules in a complex disease,
  • to analyse the modules to understand disease mechanisms and individual variations,
  • to find protein markers of those variations,
  • to apply the markers diagnostically in an effort to predict treatment response, and
  • to make the resultant bioinformatics methods widely available in a standardized form (e.g., as web-based tools) in order to facilitate other studies of complex diseases.

The development and application of methods draw on the applicants’ experience of genome-wide association studies (Sladek et al. Nature 2007), network models of gene expression data (Jenssen et al. Nat Genet 2001, Voy et al. PLoS Comput Biol 2006) and linking such models to genetic variations (Chesler et al. Nat Genet 2005). These methods are further developed on allergen-challenged lymphocytes from patients with seasonal allergic rhinitis (SAR). This is an optimal disease model because it is common, well-defined, has a known external cause (pollen) and can be analyzed in both experimental and clinical studies. The analytical methods are, however, likely to be applicable to other complex diseases. The allergen-challenged cells are analysed with DNA microarrays to identify transcriptomal modules responding to the challenge. SNPs in regulatory regions of key genes in the modules are studied, and the corresponding proteins are analysed in supernatants as well as in nasal fluids as diagnostic markers (Benson et al. J Allergy Clin Immunol 2006). The timetable calls for delivery of the first multi-layer modules after the first 18 project months, based on a transcriptomal template from DNA microarray studies. Onto this template are added data about regulatory elements such as transcription factors, microRNAs and DNA methylation. These elements are analyzed on a genome-wide scale, and this information will be used to examine selected SNPs and proteins. During the next six months, selected genes will be examined experimentally. For genes with unknown roles, this will involve combined bioinformatics and RNAi studies as described by the participants (Murali et al. Nat Biotechnol 2006, Sonnichsen et al. Nature 2005, Echeverri et al. Nat Methods 2006, Nat Rev Genet 2006). The aim is to define one or more disease modules that are general to all patients with seasonal allergic rhinitis.
During year three, large-scale studies will be performed to define individual variations in such modules, ranging from SNPs to proteins. In parallel, the original general modules will be refined using novel methods to analyse other layers and regulatory elements. During year four, clinical studies will be performed to test if selected protein markers can be used to predict treatment response. In addition, the analytical methods will be made available on the Internet in a standardized format for studies of other complex diseases.

Progress beyond the state-of-the-art

State of the art

Complex diseases like diabetes, allergy and cancer depend on altered interactions between large numbers of genes, many of which do not belong to known disease mechanisms. Genome-wide association studies performed by one of the applicants have, for example, described such genes in diabetes (Sladek et al. Nature 2007). In allergic disease, DNA microarray studies have shown changes in the expression of hundreds of genes (Benson et al. J Allergy Clin Immunol 2004, 2006). Emerging high-throughput technologies indicate disease-associated changes in other layers and regulatory elements, for example copy number variants, DNA methylation and microRNAs (Hardiman et al. Pharmacogenomics 2006). On top of this complexity there is considerable individual heterogeneity. A clinical consequence of this is variable response to treatment, which increases both suffering and cost. Personalized medication has therefore been highlighted as a priority. At present, however, there are only a few examples that have reached the clinic (Fox JL. Nat Biotechnol 2007).

One approach to functionally understand gene expression changes in complex diseases may be to change the scale from individual genes to groups of functionally related genes. Such genes may be identified with bioinformatics methods, like cluster analysis, that group genes whose expression levels correlate. These clusters can be used for classification, for example of different lymphomas (Alizadeh et al. Nature 2000). However, the analysis in itself does not give any functional understanding of disease mechanisms. One possibility to obtain such understanding is to search the gene expression data for genes known to belong to specific pathways (Benson et al. Cytokine 2002). A problem with this is that complex diseases often involve multiple interacting pathways, which may be difficult to separate from each other. Rather, they form sub-networks or modules. Such modules have been identified in studies of cancer and functionally annotated (Segal et al. Nat Genet 2005). Networks provide a compelling framework to organize and functionally understand complex systems (Barabasi et al. Nat Rev Genet 2004, Mustacchi et al. Yeast 2006). In the context of gene expression data in human cells, network-based analysis has been applied to form networks of interacting genes and dissect those networks to find modules and pathways (Jenssen et al. Nat Genet 2001, Calvano et al. Nature 2005). The same analytical methods have also been applied to human disease and used to go from modules to individual genes (figure 1). The corresponding proteins have been tried as diagnostic markers in human disease (Benson et al. J Allergy Clin Immunol 2006). The latter study was also based on computational methods developed by one of the applicants in studies of inbred mice, in which altered gene expression pattern data were used to find genetic variants that caused those alterations (Chesler et al. Nat Genet 2005).
Linking gene expression changes to genetic variants has also been performed in human cells (Bystrykh et al. Nat Genet 2005). These studies show how changes in a transcriptional module can be used as a template for further studies:

  • to find the corresponding changes in other layers (in the examples above the DNA and protein layers)
  • since the layers are interdependent these dependencies can be used to cross-validate findings in different layers and build multi-layer modules (MLM) that include data ranging from DNA to protein.
  • such MLM can be used clinically, to find diagnostic protein markers. To our knowledge this has not been previously performed. Most high-throughput analyses of complex diseases focus on one layer and rarely perform clinical or experimental validation studies. Doing so would require solving several diverse problems, which are outlined below:

Problems in high-throughput studies of complex diseases

  • a) Finding an optimal disease model for large-scale studies. Many complex diseases are heterogeneous or have unknown or diverse causes (e.g. cancer). The disease-causing cells or tissues may only be partially known or not readily accessible in humans (e.g. stroke).
  • b) Many genes have unknown or partially known functions
  • c) The need to develop computational and bioinformatics methods to build multi-layer modules
  • d) The need to assess goodness-of-fit taking into account the large number of possibilities, which can lead to problems of multiple testing and over-fitting to noisy data.
  • e) Experimental validation of disease mechanisms that may involve hundreds of genes, many of which have unknown or poorly defined functions
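
The multiple-testing problem in d) can be made concrete with a short sketch. The proposal does not name a specific correction method; the Benjamini-Hochberg procedure for controlling the false discovery rate is used here only as one widely used example, and the p-values are invented for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return indices of hypotheses rejected at false discovery rate alpha."""
    m = len(p_values)
    # Rank the hypotheses from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p-value satisfies p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below that cutoff.
    return sorted(order[:cutoff_rank])

# Hypothetical p-values for ten genes tested for differential expression.
toy_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
rejected = benjamini_hochberg(toy_p, alpha=0.05)
```

Five of the ten toy p-values fall below an uncorrected 0.05 threshold, but only the first two survive the correction. With thousands of genes per array, the gap between uncorrected and corrected findings is typically much larger.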

In this project we address these problems as follows:

  • a) We focus on seasonal allergic rhinitis (SAR), because it is common, relatively homogeneous and has a known external cause (pollen). It is possible to reduce heterogeneity further by studying unique materials, such as concordant and discordant monozygous twins. The main disease-causing cell type, the CD4+ cell, is known. We use two experimental models: 1) allergen-challenged CD4+ cells from patients and controls are analyzed with high-throughput methods to find modules, pathways and key regulatory genes as described in figure 1 (Benson et al. Genes Immun 2006). In addition, these cells are used for experimental studies of individual genes; 2) a mouse model of allergy, in which wild-type and knockout animals are compared, is also used for functional studies of individual genes (Benson et al. J Clin Invest 2007, submitted). In order to study human allergic inflammation in vivo and test diagnostic markers, we examine nasal fluids and biopsies as well as skin from patients and controls before and after allergen challenge.

Figure 1. Network-based analysis of DNA microarray data. A) Genes identified by DNA microarray analysis of allergic disease (red) are mapped onto an interaction network formed by all human genes (grey). B) Modules of interacting genes that represent disease-associated biological functions are identified. C) A module is dissected to find a pathway. D) The pathway is analysed to find a putative disease-causing gene (in this case exemplified by an upstream gene).

  • b) The roles of genes with unknown or partially known functions are defined using a combination of bioinformatics and high-throughput RNAi as described by the applicants (Murali et al. Nat Biotechnol 2006, Sonnichsen B et al. Nature 2005, Echeverri CJ et al. Nat Methods 2006, Nat Rev Genet 2006). In addition, text mining algorithms, obtained by customizing the technology of one of the partners (Jenssen et al. Nat Genet 2001), will provide predictions of gene functions in a context-specific way.
  • c) We develop and integrate state-of-the-art methods to build MLM: combinatorial algorithms find putatively co-regulated genes (Chesler et al. Nat Genet 2005), and those genes are organized into modules using network models of the human interactome derived from context-specific text mining algorithms (Jenssen et al. Nat Genet 2001), manual curation (Calvano et al. Nature 2005) and other bioinformatics sources.
  • d) Novel statistical algorithms will be developed as described in Workpackage 4.
  • e) Validation studies are performed with a combination of experimental, genomic and bioinformatics methods. Examples include blocking experiments with antibodies or RNAi.

Progress

To our knowledge this is the first project that aims to define multi-layer modules (MLM) in a complex disease and use them for a clinical goal, to personalize medication. This involves development and integration of novel computational and bioinformatics methods based on a systems biological framework. If successful, the project may serve as a model for studies of other complex diseases. The analytical methods will be made available on the Internet in a standardized format for such studies. The project may also increase understanding of the relative role of different layers and elements, as well as of genes with presently unknown functions in complex diseases.

Consortium overview

Application of systems biology and high-throughput genomics to solve a concrete clinical problem, i.e. to personalize medication, is a formidable challenge. This requires integration of many different forms of expertise and complex analytical methods that are applied to diverse materials such as human cells and knockout mice. Some of the patient groups are hard to find (e.g. monozygous twins) or need to be examined more than once. Finally, the results need to be validated in clinical studies to see if treatment response can be predicted. Since there is limited experience of analysing data of such diversity and complexity, the integrated development and application of new computational, bioinformatics and statistical solutions is required.

In order to build MLM, all the high-throughput experiments must be performed on the same individuals. Therefore the clinical materials are obtained in one clinical research centre (The Unit for Clinical Systems Biology/The Unit for Pediatric Allergology, The Queen Silvia Children’s Hospital, Göteborg, Sweden, WP1). The Unit for Clinical Systems Biology is headed by a clinician, Dr Mikael Benson, who coordinates this project. MB has a six-year grant from the Swedish Research Council as a senior researcher that is combined with his position as a senior consultant in paediatric allergology at Queen Silvia Children’s Hospital. Obtaining rare patient materials is simplified by patient registries, such as the Swedish Twin Registry (figure 2). All high-throughput experiments are performed either at the Genomics Core Facilities in Göteborg and Oslo, or by Cenix Bioscience, a biotech SME with expertise in RNA interference (WP1, WP3 and WP5, respectively).

A team of leading international experts has been assembled to analyse the data. Professor Michael A. Langston and his group of post docs and PhD students at the Department of Electrical Engineering and Computer Science at the University of Tennessee were the first to harness fixed-parameter tractability in order to develop pioneering clique-centric methods that find modules of putatively co-regulated genes in gene expression data (Abu-Khzam et al. Algorithmica 2006), link these modules to genetic variation using quantitative trait loci (Chesler et al. Nat Genet 2005), and synthesize from these methods novel topological differential analysis tools (Voy et al. PLoS Comput Biol 2006). ML is also an expert on combinatorial algorithms for the integration and analysis of biological data of large scale and wide diversity (Kirova et al. CAMDA 2006; Zhang et al. Supercomputing 2005). This work is facilitated by state-of-the-art supercomputers at the nearby Oak Ridge National Laboratory, where ML is a Collaborating Scientist in the Systems Genetics Group within the Biosciences Division (WP2).
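
The idea behind clique-centric module finding can be sketched as follows. This is not the fixed-parameter tractable method developed by the Langston group; it substitutes the classic Bron-Kerbosch enumeration of maximal cliques, and the absolute-correlation threshold, the function names and the expression values are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def correlation_graph(expression, threshold):
    """Connect two genes when their absolute correlation reaches the threshold."""
    graph = {gene: set() for gene in expression}
    for a, b in combinations(expression, 2):
        if abs(pearson(expression[a], expression[b])) >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph

def maximal_cliques(graph):
    """Enumerate all maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(current)  # nothing can extend it, so it is maximal
            return
        for v in list(candidates):
            expand(current | {v}, candidates & graph[v], excluded & graph[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(graph), set())
    return cliques

# Hypothetical expression profiles over four samples.
toy_expression = {
    "A": [1, 2, 3, 4],
    "B": [2, 4, 6, 8],
    "C": [1, 2, 3, 5],
    "D": [1, -1, 1, -1],
}
graph = correlation_graph(toy_expression, threshold=0.9)
cliques = maximal_cliques(graph)
```

In the toy data, genes A, B and C are mutually correlated above the threshold and form one clique, while D stands alone. Enumerating maximal cliques is exponential in the worst case, which is one reason specialised algorithms and supercomputing resources are relevant for genome-scale graphs.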

Functional annotation of modules is done in WP3 by Professor Eivind Hovig and associates using their PubGene co-citation literature network (Jenssen et al. Nat Genet 2001). An advantage of PubGene is that modules are not restricted by canonical gene interaction pathways that have been described in healthy cells. It may therefore be particularly suitable to functionally annotate disease modules that may have gene interactions that differ from those in healthy cells. Another advantage is that PubGene is combined with other data sources to provide cell-specific and multi-layer network information. Additionally, PubGene’s algorithms can be customized to the context-specific needs of the complex diseases to be analyzed clinically. EH is an expert in fields including the bioinformatics of biomedical text mining, microarray analysis, and annotation-based comparisons of DNA features. EH leads both a wetlab group working on cancer and gene silencing, and a bioinformatics group.

Figure 2. Analysis of selected patient groups, such as monozygous twins, under controlled conditions reduces the complexity of the project. The photo shows a nurse at the Unit for Clinical Systems Biology obtaining blood samples from twins for in vitro allergen-challenge of CD4+ lymphocytes.

An important challenge in the project is that high-throughput technologies generate data where the number of variables greatly exceeds the number of observations. This requires the development and application of new statistical methods, which will be undertaken in WP4 by Professor David J. Balding and Dr Lachlan J.M. Coin of the Centre for Biostatistics at Imperial College London. DJB has a PhD in applied probability from Oxford, and since graduating has worked to apply mathematical, computational and statistical methods to solve problems in biology and medicine, particularly in population, evolutionary and medical genetics. Recently he has been active in developing and applying novel statistical methods for the analysis of genetic association data, particularly for genome-wide association studies (Sladek et al. Nature 2007). These methods have tackled problems of confounding by population structure, exploiting haplotype clustering to strengthen signals of association, and simultaneous analysis of large numbers of genetic variants (Balding et al. Nat Rev Genet 2006). LJMC has a PhD in Bioinformatics from the Wellcome Trust Sanger Institute and Cambridge University, and has worked in comparative genomics, phylogenetics, microarray analysis, identification of copy-number variants and haplotype clustering methods based on Hidden Markov Models (Coin et al. PNAS 2003, Futreal PA et al. Nat Rev Cancer 2004).

The role of disease-associated genes identified through the methods described above will be confirmed experimentally using high-throughput RNAi (HT-RNAi). Cenix BioScience GmbH, represented by Drs. Christophe Echeverri (founder, CEO/CSO) and Birte Sönnichsen (COO) (WP5), pioneered high-throughput applications of RNAi, carrying out the first comprehensive genome-wide RNAi screen (Sönnichsen et al. Nature 2005) for genes involved in early embryogenesis of the nematode worm C. elegans, where RNAi was first described in 1998 by Fire and Mello’s Nobel Prize-winning work (Fire et al. Nature 1998). Since 2001, Cenix has focused entirely on further developing and applying the power of genome-scale HT-RNAi with high content, multi-parametric assays using automated microscopy in a wide range of human and rodent cultured cell models. Cenix has since established itself as a global leader in exploiting this technology in collaboration with both academic and pharmaceutical research groups, to advance a wide range of basic research and applied disease fields including oncology as well as metabolic, infectious and cardiovascular diseases (e.g. Sachse et al. Oncogene 2004). Along the way, Cenix has also led the way in defining the combination of computational and automated laboratory analysis infrastructures required to drive these advanced functional genomics applications (Sachse et al. Methods Enzymol 2005; Echeverri et al. Nat Rev Genet 2006 and Nat Methods 2006). CE has a PhD in cell biology from the University of Massachusetts (Worcester, MA), and his postdoctoral work at the European Molecular Biology Laboratory (EMBL, Heidelberg, Germany) pioneering the genome-scale use of RNAi formed the basis for founding Cenix.
BS has a PhD in cell biology from the University of Göttingen (Göttingen, Germany), and joined Cenix as one of its first senior scientists following her postdoctoral work at Cancer Research UK (formerly the ICRF) in London and the EMBL (Heidelberg, Germany).

Individual participants

Workpackage 1.

The Unit for Clinical Systems Biology (UCSB)/Unit for Paediatric Allergology (UPA), Queen Silvia Children’s Hospital, Göteborg, Sweden.

Organization and funding: The UCSB is part of the Centre for Systems Biology in Göteborg, which links several groups within the medical, natural science and technological faculties. This local network extends internationally through EC funded projects with a total budget of 15 million Euro. The UCSB is supported by the Swedish Research Council, the European Commission and Göteborg University. The total funding is some 2 million Euro over the next three years.

Key investigators and their roles: Mikael Benson is the coordinator of the project and the head of the UCSB. He and co-workers are responsible for gathering the clinical materials and performing the experimental and genomic studies in collaboration with the UPA, the Department of Immunology and the Bio-X-Med genomics unit, as well as with the applicants in WP3 and WP5. MB is a senior consultant in pediatric allergology at the UPA. He has a six-year grant as a senior researcher from the Swedish Research Council to apply high-throughput technology and systems biology to find markers for personalized medication.

Five relevant publications
  1. Benson M, Langston MA, Adner M, Andersson B, Torinsson-Naluai A, Cardell LO. A network-based analysis of the late-phase reaction of the skin. The Journal of Allergy and Clinical Immunology 2006;118(1):220-5.
  2. Benson M, Carlsson L, Guillot G, et al. A network-based analysis of allergen-challenged CD4+ T cells from patients with allergic rhinitis. Genes and Immunity 2006;7(6):514-21.
  3. Benson M, Breitling R. Network theory to understand microarray studies of complex diseases. Current Molecular Medicine 2006;6(6):695-701.
  4. Benson M, Carlsson L, Adner M, et al. Gene profiling reveals increased expression of uteroglobin and other anti-inflammatory genes in glucocorticoid-treated nasal polyps. The Journal of Allergy and Clinical Immunology 2004;113(6):1137-43.
  5. Benson M, Carlsson B, Carlsson LM, et al. DNA microarray analysis of transforming growth factor-beta and related transcripts in nasal biopsies from patients with allergic rhinitis. Cytokine 2002;18(1):20-5.

Workpackage 2.

The Department of Electrical Engineering and Computer Science, University of Tennessee, USA

Organization and funding: The University of Tennessee is a comprehensive land-grant university and a Carnegie I Research Institution. Knoxville is the flagship campus of the University of Tennessee system. Students enroll from every state in the nation and over 100 foreign countries. Oak Ridge National Laboratory is the Department of Energy’s primary institution for high performance computing. It has distinctive capabilities in biological science, materials science, neutron science and many other areas. ORNL employs roughly 1500 scientists and engineers and covers a total of 58 square miles. Dr. Langston regularly consults at and maintains accounts on ORNL’s vast assortment of state-of-the-art clusters, supercomputers and mass storage systems. His team is funded 75% time through UT and 25% time through grants on which he serves as PI or co-PI.

Key PIs and their roles: Professor Langston leads a team of students, post doctoral fellows and research associates whose work is focused on efficient algorithm design, analysis and high performance implementations, with a special emphasis on applications to computational biology. He also serves as Collaborating Scientist at ORNL, where he maintains offices in the Biosciences Division and regularly consults in the Computer Science and Mathematics Division, the Chemical Sciences Division, the Joint Institute for Computational Science and the Computational Biology Institute. He is currently in the process of developing portals through which the community at large may access his team’s computational tools. His work in developing ClustalXP is a prominent example. Professor Langston has authored over 200 refereed publications, including those in journals relevant to this project such as Nature Genetics, PLoS Computational Biology, Journal of the ACM, and Journal of Allergy and Clinical Immunology. He is perhaps best known for his long-standing work on combinatorial algorithms, complexity theory and design paradigms for sequential and parallel computation. In addition to maintaining his research program, he regularly teaches courses on algorithmic analysis, computational and systems biology, discrete optimization, graph theory and related subjects. His research has been funded in the U.S. by the National Science Foundation, the Department of Defense, the Department of Energy, the National Institutes of Health and a variety of other agencies. It has been funded abroad by the Australian Research Council and the European Commission. He has received numerous awards, most recently the Distinguished Service Prize from the Association for Computing Machinery Special Interest Group on Algorithms and Computation Theory.

Five relevant publications
  1. F. N. Abu-Khzam, M. A. Langston, P. Shanbhag, and C. T. Symons, Scalable parallel algorithms for FPT problems, Algorithmica, vol. 45, 2006, 269-284.
  2. M. Benson, B. Andersson, M. Adner, Å. Torinsson-Naluai, M. A. Langston, and L. O. Cardell, A Network-Based Analysis of the Late Phase Reaction of the Skin, Journal of Allergy and Clinical Immunology, vol. 118, 2006, 220-225.
  3. E. J. Chesler, L. Lu, S. Shou, Y. Qu, J. Gu, J. Wang, H. C. Hsu, J. D. Mountz, N. E. Baldwin, M. A. Langston, J. B. Hogenesch, D. W. Threadgill, K. F. Manly, and R. W. Williams, Complex trait analysis of gene expression uncovers polygenic and pleiotropic networks that modulate nervous system function, Nature Genetics, vol. 37 (3), 2005, 233-242.
  4. M. A. Langston, L. Lan, X. Peng, N. E. Baldwin, C. T. Symons, B. Zhang, and J. R. Snoddy, “A combinatorial approach to the analysis of differential gene expression data: the use of graph algorithms for disease prediction and screening,” in Methods of Microarray Data Analysis IV, Papers from CAMDA '03, K. F. Johnson and S. M. Lin, Eds. (Boston: Kluwer Academic Publishers, 2005) 223-238.
  5. B. H. Voy, J. A. Scharff, A. D. Perkins, A. M. Saxton, B. Borate, E. J. Chesler, L. K. Branstetter, and M. A. Langston, Extracting gene networks for low dose radiation using graph theoretical algorithms, PLoS Computational Biology, vol. 2 (7), 2006

Workpackage 3.

The Department of Informatics at the University of Oslo, Norway

Organization and funding: The research group is integrated within several institutions, including the Rikshospitalet-Radiumhospitalet university clinic and the Institute of Informatics at the University of Oslo. The group also runs a bioinformatics core facility for both institutions, funded by them. Funding is also obtained from the Functional genomics program of Norway. The group is integrated into the national bioinformatics framework. The group is active in text mining, the statistical mechanics of DNA and a number of aspects related to microarrays.

Key investigators and their roles: Eivind Hovig, professor at the Department of Informatics at the University of Oslo, Norway, holds positions as a group leader at the Institute for Cancer Research, and as section head at the Department of medical informatics at The Norwegian Radium Hospital. He leads two research groups (a wetlab group and a drylab group) of students, post doctoral fellows and research associates whose work is focused on development and implementation of high-throughput techniques for genomics and clinical bioinformatics. He has coauthored some 85 articles in journals including Nature Genetics, Nature Biotechnology, PNAS and Lancet, has authored 7 patents and has received awards for scientific excellence as well as inventor prizes for his work. He is the leader of the FUGE functional genomics platform of bioinformatics in the Oslo region, and a member of the board of the national FUGE bioinformatics steering group. He is chief scientific officer of the Norwegian-based bioinformatics company PubGene Inc., which is based on a patent of Hovig, Jenssen et al., and is engaged as an adviser in two other companies.

Five relevant publications
  1. Liu F, Tostesen E, Sundet JK, Jenssen TK, Bock C, Jerstad GI, Thilly WG, Hovig E. The human genomic melting map. PLoS Comput Biol. 2007 May 18;3(5):e93. Epub 2007 Apr 11.
  2. Kuo WP, Liu F, Trimarchi J, Punzo C, Lombardi M, Sarang J, Whipple ME, Maysuria M, Serikawa K, Lee SY, McCrann D, Kang J, Shearstone JR, Burke J, Park DJ, Wang X, Rector TL, Ricciardi-Castagnoli P, Perrin S, Choi S, Bumgarner R, Kim JH, Short GF 3rd, Freeman MW, Seed B, Jensen R, Church GM, Hovig E, Cepko CL, Park P, Ohno-Machado L, Jenssen TK. A sequence-oriented comparison of gene expression measurements across different hybridization-based technologies. Nat Biotechnol. 2006 Jul;24(7):832-40.
  3. Nygaard V, Holden M, Loland A, Langaas M, Myklebost O, Hovig E. Limitations of mRNA amplification from small-size cell samples. BMC Genomics. 2005 Oct 27;6:147.
  4. Wang J, Bo TH, Jonassen I, Myklebost O, Hovig E. Tumor classification and marker gene prediction by feature selection and fuzzy c-means clustering using microarray data. BMC Bioinformatics. 2003 Dec 2;4:60.
  5. Jenssen TK, Laegreid A, Komorowski J, Hovig E. A literature network of human genes for high-throughput analysis of gene expression. Nat Genet. 2001 May;28(1):21-8.

Workpackage 4.

The Department of Epidemiology and Public Health, Imperial College, St Mary's Campus, London, UK

Organization and funding: The Centre for Biostatistics at Imperial College (http://www.icbiostatistics.org.uk/), jointly led by Professors David Balding and Sylvia Richardson, includes around 10 academic staff, 20 postdocs and 15 research students, based in several departments but primarily within the Department of Epidemiology and Public Health at the St Mary’s Hospital campus. The research of the Centre is focussed on using advanced biostatistical methods to enhance population health. It has major strengths in the statistical analysis of genetic association studies and gene expression studies, particularly using highly-structured Bayesian modelling implemented via intensive stochastic simulation techniques. The latter exploit a local computer farm within the Department as well as the Imperial High Performance Computing centre. The work of the Centre is funded by UK research councils (MRC, BBSRC, EPSRC, ESRC), research charities (Wellcome Trust, BHF), the European Union, the US NIH and industry (GSK, AstraZeneca).

Key PIs and their roles: DJB is Professor of Statistical Genetics and LJMC is Senior Research Fellow, and both contribute to leading a team of around 12 postdocs and PhD students. LJMC will be the primary supervisor of the postdoc, and together they will deliver the principal research outcomes of the workpackage. DJB will offer advice and overall supervision, and co-ordinate contributions from other members of the Centre for Biostatistics when their specialist expertise is required for the project. In addition to the deliverables specified above, the Imperial College team will provide specialist biostatistical advice to the MultiMod project partners as needs arise during the collaboration.

Five relevant publications
  1. A genome-wide association study identifies novel risk loci for type 2 diabetes, Sladek R, Rocheleau G, Rung J, Dina C, Shen L, Serre D, Boutin P, Vincent D, Belisle A, Hadjadj S, Balkau B, Heude B, Charpentier G, Hudson D, Montpetit A, Pshezhetsky A, Prentki M, Posner B, Balding D, Meyre D, Polychronakos C, Froguel P, Nature 445, 881-885, 2007
  2. A tutorial on statistical methods for population association studies, Balding DJ, Nature Reviews Genetics 7: 781-791, 2006. doi:10.1038/nrg1916
  3. Clinical factors and ABCB1 polymorphisms in prediction of antiepileptic drug response: a prospective cohort study, Leschziner G, Jorgensen AL, Andrew T, Pirmohamed M, Williamson PR, Marson AG, Coffey AJ, Middleditch C, Rogers J, Bentley DR, Chadwick DW, Balding DJ, Johnson MR, Lancet Neurology 5: 668–76, 2006.
  4. Logistic regression protects against population structure in genetic association studies, Setakis E, Stirnadel H, Balding D, Genome Research, 16: 290-296, 2006
  5. Improved techniques for the identification of pseudogenes. Coin L, Durbin R, Bioinformatics 20 Suppl 1:I94–I100, 2004.

Workpackage 5.

Cenix BioScience GmbH

Organization and funding: Cenix BioScience GmbH (based in Dresden, Germany: http://www.cenix-bioscience.com) is a privately held biotechnology company founded in 1999, specializing in advanced, cell-based applications of RNAi for the discovery and functional characterization of novel therapeutic targets, biomarkers and drug candidates. Led by Drs. Christophe Echeverri (lead founder, CEO/CSO) and Birte Sönnichsen (COO), Cenix counts 34 staff, 28 of whom form the scientific team combining both laboratory and IT operations that drive the company’s core offerings: advanced research services. The revenues from these RNAi-based discovery projects with both pharma (e.g. Bayer, Schering, Merck KGaA) and academic clients have secured the company’s continued profitability for the last 4 years running. Cenix has also maintained active R&D activities to further develop its technology base, both on its own and within research consortia supported by 7 major grants from German federal granting programs. Cenix scientists have successfully completed genome-scale RNAi screening projects using various human and rodent cell-based models in a wide range of disease areas including oncology, as well as metabolic, cardiovascular and infectious diseases. Cenix's particular area of specialisation is combining HT cell-based applications of RNAi with high-content, multi-parametric readout assays, to extract the richest possible analyses from these datasets, with maximal patho-physiological relevance.

Key PIs and their roles: Dr. Maria Mirotsu has led major RNAi studies for clients and several internal RNAi optimization studies in disease-relevant cell systems at Cenix, including primary B and T lymphocytes and the monocyte cell line U937.

Five relevant publications
  1. Sönnichsen B, Koski LB, Walsh A, Marshall P, Neumann B, Brehm M, Alleaume A-M, Artelt J, Bettencourt P, Cassin E, Hewitson M, Holz C, Khan M, Lazik S, Martin C, Nitschke B, Ruer M, Stamford J, Winzi M, Heinkel R, Röder M, Finell J, Häntsch H, Jones SJ, Jones M, Piano F, Gunsalus KC, Oegema K, Gönczy P, Coulson A, Hyman AA, Echeverri CJ. Full genome RNAi profiling of early embryogenesis in C. elegans. 2005. Nature 434: 462-9.
  2. Gunsalus KC, Ge H, Schetter AJ, Goldberg DS, Han JD, Hao T, Berriz GF, Bertin N, Huang J, Chuang LS, Li N, Mani R, Hyman AA, Sönnichsen B, Echeverri CJ, Roth FP, Vidal M, Piano F. Predictive models of molecular machines involved in Caenorhabditis elegans early embryogenesis. 2005. Nature 436: 861-5.
  3. Pelkmans L, Fava E, Grabner H, Hannus M, Habermann B, Krausz E and Zerial M. Genome-wide analysis of human kinases in clathrin- and caveolae/raft-mediated endocytosis. 2005. Nature 436: 78-86.
  4. Echeverri CJ, Perrimon N. High-throughput RNAi screening in cultured cells: a user’s guide. 2006. Nature Reviews Genetics 7: 373-84.
  5. Echeverri CJ, Beachy PA, Baum B, Boutros M, Buchholz F, Chanda SK, Downward J, Ellenberg J, Fraser AG, Hacohen N, Hahn WC, Jackson AL, Kiger A, Linsley PS, Lum L, Ma Y, Mathey-Prévôt B, Root DE, Sabatini DM, Taipale J, Perrimon N & Bernards R. Minimizing the risk of reporting false positives in large-scale RNAi screens. 2006. Nature Methods 3(10):777-9.
start.txt · Last modified: 2009/07/08 16:24 by Fredrik Barrenäs
 