HiperBio

High-Performance Reconfigurable Computing set of tools and methodologies for the hardware development of Bioinformatics and Computational Biology (BCB) applications
Period
September 2009 - December 2010

HiperBio is a set of High-Performance Reconfigurable Computing tools and methodologies for the hardware development of Bioinformatics and Computational Biology (BCB) applications. Focusing on structural parallelization, we will first implement three industrially relevant applications that together represent a wide spectrum of BCB problem domains. These specific solutions will then be generalized into a library of hardware-oriented blocks performing functions such as sequence alignment, motif search, and functional network simulation.

Objectives

  • To establish a basic computational facility for the implementation of algorithms for bioinformatics and computational biology (BCB).
  • To conceive, develop, and implement in hardware three industrially relevant BCB solutions as a proof of concept.
  • To develop a set of hardware-oriented structural and/or computational building “blocks” that will serve as the basic bricks for implementing further advanced BCB solutions.
  • To acquire advanced know-how and specific skills in the development of BCB hardware.
 

Methodology

Within this project, we propose to first develop hardware accelerators for three industrially relevant BCB methods or algorithms. Each will be implemented on at least one of the two available hardware platforms: a development board integrating a Virtex-5 SX50T with a PCIe x4 connection to the host PC, and a multi-FPGA board comprising seven separate Virtex-5 LX50 chips, each with its own PCIe x1 connection to the host PC. The three algorithms are the following:

  • PLINK algorithm: PLINK is a whole-genome analysis software package, often referred to as the Swiss Army knife of the field. Among the many analyses it offers, we will focus on Single Nucleotide Polymorphism (SNP) association and SNP epistasis, which involve millions of simple pairwise comparisons of vectors, each containing a very large number of values.
  • Biomarker-based diagnostic decision modeling: A significant current trend bases disease diagnosis on measuring the levels of activity or expression, in the organism, of different substances called biomarkers. Such a diagnostic decision often depends on complex patterns of activity across many biomarkers. However, measuring these biomarkers is costly, so there is pressure to find the minimal set that still allows a specific disease to be diagnosed efficiently. Fuzzy CoCo, the proposed modeling approach, is a cooperative co-evolutionary algorithm that has proved able to find highly predictive candidate systems. Moreover, in addition to their predictive value, the discovered fuzzy models, being based on linguistic rules, also offer an explanation of the possible reasons underlying the diagnostic decisions made.
  • Functional modeling of gene regulatory networks: The functional modeling of gene regulatory networks (GRN) consists of iteratively updating the expression values of the genes in a network (from hundreds to several thousand genes) until the network reaches a steady or oscillating state. Each gene's expression value usually depends on those of several or many other genes in the network, following more or less complex dynamics. As a result, this iterative update requires a very large number of evaluations, and the complexity of the computation increases exponentially with the number of genes in the network and with the number of relations linking their expression levels.
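To make the iterative update described for gene regulatory networks more concrete, the following minimal sketch simulates a small Boolean network in software. It is an illustration only, not the project's implementation: the Boolean (on/off) expression model, the synchronous update scheme, the function names `step` and `simulate`, and the three-gene example network are all assumptions introduced here.

```python
from typing import Callable, Dict, List, Tuple

# Assumption: each gene is either expressed (True) or not (False).
State = Tuple[bool, ...]
Rule = Callable[[State], bool]


def step(state: State, rules: List[Rule]) -> State:
    """Synchronously update every gene from the previous network state."""
    return tuple(rule(state) for rule in rules)


def simulate(initial: State, rules: List[Rule], max_steps: int = 1000) -> Tuple[int, int]:
    """Iterate until a state repeats.

    Returns (transient_length, cycle_length). A cycle length of 1 is a
    steady state; a larger value is an oscillating state.
    """
    seen: Dict[State, int] = {}
    state = initial
    for t in range(max_steps):
        if state in seen:
            return seen[state], t - seen[state]
        seen[state] = t
        state = step(state, rules)
    raise RuntimeError("no repeated state within max_steps")


# Hypothetical 3-gene ring in which each gene is repressed by its predecessor.
ring_rules: List[Rule] = [
    lambda s: not s[2],
    lambda s: not s[0],
    lambda s: not s[1],
]

# Hypothetical 2-gene toggle switch: each gene represses the other.
toggle_rules: List[Rule] = [
    lambda s: not s[1],
    lambda s: not s[0],
]

ring_result = simulate((True, False, False), ring_rules)
toggle_result = simulate((True, False), toggle_rules)
print("ring:", ring_result, "toggle:", toggle_result)
```

Starting from the states shown, the ring settles into an oscillation of period 6 and the toggle switch is already in a steady state. Every gene in `step` depends only on the previous state, so all genes can be evaluated in parallel, which is the kind of structural parallelism a hardware implementation could exploit.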

In a second phase, we will extract from these implementations the salient principles that could be used to generalize the hardware acceleration of further BCB algorithms. These generalizations will cover data transfer and communication requirements, as well as the structural patterns and dataflows specific to BCB algorithms, at a grain coarse enough to apply to many different types of BCB computations.

Finally, based on these principles, we will propose a library of modules, both software and hardware, that should facilitate the implementation of hardware accelerators for many BCB applications.