Machine learning (ML) has been gaining interest in the metabolic engineering community as a means to automate prediction tasks. In this work, we introduce and study the task of using ML to recommend high-fitness triplet mutants as candidates for wet-lab experiments. We first use individual and digenic fitness scores as features and train machine learning models that produce a list of triplet gene mutants of S. cerevisiae, ranked from high to low fitness. Then, we incorporate prior metabolic knowledge from an existing gene ontology by designing a novel graph representation and deriving features that capture gene similarity and gene interactions. Experimental results show that our proposed gene ontology enriched model, termed TriGORank, improves both performance and explainability.
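The first step described in the abstract, building features for a triplet from individual and digenic fitness scores and then ranking triplets by a model score, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the gene names and fitness values are made up, the helper names (`triplet_features`, `score`, `rank_triplets`) are hypothetical, and the uniform linear scorer merely stands in for a trained ranking model. It is not the TriGORank model and omits the gene ontology features entirely.

```python
from itertools import combinations

# Toy, made-up fitness values; gene names are placeholders, not real measurements.
single_fitness = {"A": 0.9, "B": 0.8, "C": 1.0, "D": 0.7}
digenic_fitness = {
    frozenset(("A", "B")): 0.75,
    frozenset(("A", "C")): 0.92,
    frozenset(("A", "D")): 0.60,
    frozenset(("B", "C")): 0.85,
    frozenset(("B", "D")): 0.50,
    frozenset(("C", "D")): 0.72,
}


def triplet_features(triplet):
    """Return the three single-gene scores followed by the three pairwise scores.

    Each group is sorted so the feature vector does not depend on gene order
    (an illustrative design choice, not taken from the paper).
    """
    singles = sorted(single_fitness[g] for g in triplet)
    pairs = sorted(digenic_fitness[frozenset(p)] for p in combinations(triplet, 2))
    return singles + pairs


def score(features, weights):
    """Hypothetical linear scorer standing in for a trained ranking model."""
    return sum(w * x for w, x in zip(weights, features))


def rank_triplets(genes, weights):
    """Enumerate all gene triplets and return them from highest to lowest score."""
    triplets = list(combinations(sorted(genes), 3))
    scored = [(score(triplet_features(t), weights), t) for t in triplets]
    return [t for _, t in sorted(scored, key=lambda st: st[0], reverse=True)]


weights = [1.0] * 6  # assumed uniform weights; a real model would learn these
ranking = rank_triplets(single_fitness, weights)
print(ranking)
```

In practice the scorer would be replaced by a learned ranking model trained on measured trigenic fitness, and the feature vector would be extended with the ontology-derived similarity and interaction features the abstract mentions.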
Sahiti Labhishetty, Ismini Lourentzou, Michael Jeffrey Volk, Shekhar Mishra, Huimin Zhao, Chengxiang Zhai: TriGORank: A Gene Ontology Enriched Learning-to-Rank Framework for Trigenic Fitness Prediction. BIBM 2021: 1841-1848
- Date of publication: January 14, 2022
- IEEE International Conference on Bioinformatics and Biomedicine