Graphic is from the paper “Text-to-SQL Generation for Question Answering on Electronic Medical Records”

In 2016, Ping Wang followed her advisor, Chandan Reddy, from Wayne State University, where she received a master’s degree in computer science, to Virginia Tech and the Sanghani Center.

Her area of interest is healthcare systems, which are undergoing many changes in the era of big data.

“Advances in artificial intelligence and digitization in healthcare have enabled healthcare providers to effectively sift through tremendous amounts of medical information,” said Wang. “My first research project in this direction was about survival analysis and my advisor Dr. Reddy and other group members provided many useful suggestions and help at the initial stage. After further investigation, I found that there are still many unique challenges in the healthcare domain. I hope to leverage my expertise in data mining and machine learning to solve real-world challenges and advance healthcare applications.”

While earning her Ph.D., Wang has been based at different times in both Arlington and Blacksburg. She said she has enjoyed her experiences on both campuses, maintaining regular meetings with Reddy and other group members to discuss her research and its progress.

“The professional environment for learning and conducting research at the Sanghani Center has offered me great research and collaboration opportunities,” Wang said.

Her research focuses on developing machine learning methods that can efficiently utilize Electronic Health Records (EHRs). These records contain patients’ medical and treatment histories and support physicians’ decision making in clinical practice.

Wang is looking at three aspects: (1) Clinical Question Answering: how to find answers in EHRs to clinical-activity-related questions posed in human language, without the assistance of database and natural language processing (NLP) domain experts; (2) Survival Analysis: how to predict when a medical event will occur and estimate its probability based on patients’ prior medical history; and (3) Knowledge Discovery: how to discover underlying relationships between different events and entities in structured tabular EHRs, and how to apply NLP techniques to construct structured events and a knowledge base from clinical notes.
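To make the survival analysis aspect more concrete, here is a minimal sketch of the classic Kaplan-Meier estimator, a standard textbook technique and not the tensor-based multi-task method from Wang's own work. The function name `kaplan_meier` and the toy follow-up data are made up for illustration. It estimates the probability that a patient has not yet experienced an event by a given time, while accounting for patients whose follow-up ended before any event was observed (censoring).

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each time with an observed event.

    durations: follow-up time for each patient (hypothetical units, e.g. days)
    observed:  True if the event occurred at that time, False if censored
    """
    events = Counter(t for t, e in zip(durations, observed) if e)
    exits = Counter(durations)  # events and censorings both leave the risk set
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            # Multiply by the fraction of at-risk patients who survive time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Toy data, invented for this example: five patients, two of them censored.
durations = [2, 3, 3, 5, 8]
observed = [True, True, False, True, False]
curve = kaplan_meier(durations, observed)

for time, prob in curve:
    print(f"t={time}: estimated survival probability {prob:.2f}")
```

Censored patients still count toward the at-risk group until they leave the study, which is what distinguishes survival analysis from an ordinary regression on event times.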

One of the goals in clinical question answering is to develop machine learning methods that can automatically find answers to human-language questions in the relational tables of an EHR database, she said. Traditionally, doctors interact with EHRs through the searching and filtering functions of rule-based systems. These systems first turn predefined rules (the user interface) into SQL queries, which are then executed on the database to retrieve patient information.

“These systems are complicated, difficult to manage, and require special training,” Wang said. “To tackle this problem, we proposed building a Text-to-SQL Query Translation System that can automatically translate clinical activity related questions to SQL queries, so that the doctors only need to type their questions in a search box to get answers. I also created a MIMICSQL dataset for question answering on tabular EHR to simulate a more realistic setting.”
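As a rough illustration of what such a translation involves, the sketch below pairs a typed question with the kind of SQL query a Text-to-SQL model would be expected to produce, then runs it on a small in-memory database. Everything here is an assumption made for the example: the two tables, their columns, the sample rows, and the hand-written query are hypothetical and do not reflect the actual MIMICSQL schema or the model described in the paper.

```python
import sqlite3

# Hypothetical toy schema and rows; NOT the real MIMICSQL tables.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE patients (patient_id INTEGER PRIMARY KEY, name TEXT, age INTEGER)"
)
conn.execute("CREATE TABLE diagnoses (patient_id INTEGER, diagnosis TEXT)")
conn.executemany(
    "INSERT INTO patients VALUES (?, ?, ?)",
    [(1, "A", 71), (2, "B", 45), (3, "C", 66)],
)
conn.executemany(
    "INSERT INTO diagnoses VALUES (?, ?)",
    [(1, "sepsis"), (2, "sepsis"), (3, "asthma")],
)

# What a doctor might type into the search box.
question = "How many patients older than 60 were diagnosed with sepsis?"

# The SQL a translation model would aim to generate; written by hand here.
predicted_sql = """
SELECT COUNT(DISTINCT p.patient_id)
FROM patients AS p
JOIN diagnoses AS d ON d.patient_id = p.patient_id
WHERE p.age > 60 AND d.diagnosis = 'sepsis'
"""


def answer(connection, sql):
    """Execute a single-value query and return its result."""
    return connection.execute(sql).fetchone()[0]


result = answer(conn, predicted_sql)
print(question, "->", result)
```

The doctor only sees the question and the answer; the join, the filter conditions, and the aggregation are what the system has to infer from the wording.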

This work, “Text-to-SQL Generation for Question Answering on Electronic Medical Records,” was published at The Web Conference 2020.

Most recently, Wang presented “Self-Supervised Learning of Contextual Embeddings for Link Prediction in Heterogeneous Networks” virtually at The Web Conference 2021.

Among her other published work is “Tensor-based Temporal Multi-Task Survival Analysis,” which appeared in IEEE Transactions on Knowledge and Data Engineering in 2020.

Wang plans to defend her dissertation this summer and will join the Department of Computer Science at Stevens Institute of Technology as a tenure-track assistant professor for the Fall 2021 semester.