The primary goal of survival analysis is to predict the occurrence of a particular event of interest at future time points. Incomplete observations, which arise from time limitations or loss of data traces, are known as censoring; censoring brings unique challenges in this domain and differentiates survival analysis from standard regression methods. Popular survival analysis methods such as the Cox proportional hazards model and parametric survival regression rely on strict assumptions that are unrealistic in most real-world applications. To overcome the weaknesses of these two types of methods, in this paper we reformulate survival analysis as a multi-task learning problem and propose a new formulation that predicts the survival time by estimating the survival status at each time interval during the study duration. We propose an indicator matrix that enables the multi-task learning algorithm to handle censored instances, and we incorporate important characteristics of survival problems, such as the non-negative non-increasing list structure, into our model through max-heap projection. We employ the L2,1-norm penalty, which enables the model to learn a shared representation across related tasks and hence to select important features and alleviate over-fitting in high-dimensional feature spaces, thus reducing the prediction error of each task. To efficiently handle the two non-smooth constraints, we propose an optimization method that employs the Alternating Direction Method of Multipliers (ADMM) algorithm to solve the proposed multi-task learning problem. We demonstrate the performance of the proposed method on real-world high-dimensional microarray gene expression benchmark datasets and show that our method outperforms state-of-the-art methods.
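To make the formulation concrete, the following minimal sketch shows one way the per-interval survival status targets, the indicator matrix for censored instances, and the L2,1-norm could be computed. It is an illustration under stated assumptions, not the paper's implementation: the function names (`build_targets`, `l21_norm`, `masked_squared_loss`), the convention that a time of `t` means the subject was observed alive for the first `t` intervals, and the masked squared loss are all hypothetical choices made here for illustration.

```python
import numpy as np


def build_targets(times, events, n_intervals):
    """Build a status matrix Y and an indicator matrix W (assumed convention).

    times[i]  : number of intervals subject i was observed alive.
    events[i] : 1 if the event was observed, 0 if the subject was censored.
    Y[i, j]   : 1 while the subject is alive in interval j, 0 after the event.
    W[i, j]   : 1 if the status in interval j is known, 0 if it is unknown
                because the subject was censored before interval j.
    """
    times = np.asarray(times, dtype=int)
    events = np.asarray(events, dtype=bool)
    n = len(times)
    Y = np.zeros((n, n_intervals))
    W = np.ones((n, n_intervals))
    for i in range(n):
        t = min(int(times[i]), n_intervals)
        Y[i, :t] = 1.0          # alive for the first t intervals
        if not events[i]:
            W[i, t:] = 0.0      # status after censoring is unknown
    return Y, W


def l21_norm(B):
    """L2,1-norm: sum of the Euclidean norms of the rows of B.

    With rows indexing features and columns indexing tasks (time intervals),
    penalizing this norm pushes whole rows to zero, i.e. it selects features
    jointly across all tasks.
    """
    return float(np.sum(np.linalg.norm(B, axis=1)))


def masked_squared_loss(X, B, Y, W):
    """Squared loss over known entries only (an assumed form of the loss)."""
    residual = W * (Y - X @ B)
    return 0.5 * float(np.sum(residual ** 2))


# Two subjects, four intervals: the first has the event after interval 2,
# the second is censored after interval 3.
Y, W = build_targets(times=[2, 3], events=[1, 0], n_intervals=4)
```

In this example each row of `Y` is a non-negative, non-increasing sequence (`[1, 1, 0, 0]` for the first subject), which is the list structure that the abstract says is enforced on the predictions. The last entry for the censored subject is masked out by `W` rather than treated as an observed event.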
- Date of publication: August 13, 2016
- ACM SIGKDD International Conference on Knowledge Discovery and Data Mining