The functional classification of genes plays a vital role in molecular biology. Detecting previously unknown role of genes and their products in physiological and pathological processes is an important and challenging problem. In this work, information from several biological sources such as comparative genome sequences, gene expression and protein interactions are combined to obtain robust results on predicting gene functions. The information in such heterogeneous sources is often incomplete and hence making the maximum use of all the available information is a challenging problem. We propose an algorithm that improves the performance of prediction of different models built on individual sources. We also develop a heterogeneous boosting framework that uses all the available information even if some sources do not provide any information about some of the genes. We demonstrate the superior performance of the proposed methods in terms of accuracy and F-measure compared to several imputation and integration schemes.
- Date of publication:
- May 1, 2015
- Int. J. of Data Mining and Bioinformatics
- Page number(s):
- Issue Number: