Sanghani Center Student Spotlight: Amarachi Blessing Mbakwe

Graphic is from the paper “CheXRelNet: An Anatomy-Aware Model for Tracking Longitudinal Relationships between Chest X-Rays”

In her research at the Sanghani Center, Ph.D. student Amarachi Blessing Mbakwe is trying to develop advanced artificial intelligence methodologies for better medical imaging and clinical decision-making.

Her passionate drive to improve healthcare systems that could save millions of lives worldwide stems from personal experience. With the deaths of two close family members in her home region in Nigeria, Mbakwe witnessed firsthand the devastating consequences of delayed disease detection, poor treatment management, and a shortage of healthcare professionals. 

Targeted intervention can improve healthcare access for everyone and mitigate the disparities in clinical care often faced by underrepresented populations and minorities, said Mbakwe, who is advised by Ismini Lourentzou.

“By developing an AI algorithm that can accurately and quickly analyze chest x-rays, my research can help reduce the time and effort required for radiologists to interpret medical imaging tests which, in turn, can help ensure timely patient treatment or adjustment of treatments, especially in regions with a shortage of radiologists,” she said.

Mbakwe has published papers and articles in various journals and conferences. She presented a collaborative paper, “CheXRelNet: An Anatomy-Aware Model for Tracking Longitudinal Relationships between Chest X-Rays,” at the 2022 Medical Image Computing and Computer Assisted Intervention Society conference in Singapore and, this spring, at the Computing Research Association 2023 CRA-WP Grad Cohort Workshop for IDEALS in Hawaii and the 2023 Grad Cohort Workshop for Women.

CheXRelNet incorporates local and global visual features, utilizes inter-image and intra-image anatomical information, and learns dependencies between anatomical region attributes via graph attention to accurately predict disease progression for a pair of chest x-rays.

“I was attracted to Virginia Tech’s Department of Computer Science and the Sanghani Center because I wanted to conduct impactful research that benefits society and they provided me with the perfect platform to achieve my goals,” Mbakwe said.

She said that the outcome of her research is not only applicable in healthcare but could also extend to other applications in fairness and finance. Last summer, she had the opportunity to intern at JPMorgan Chase & Co. as an AI research associate and will be returning for a second internship this summer.

Mbakwe earned a bachelor’s degree in mathematics from Nnamdi Azikiwe University, Anambra State, Nigeria, and a master’s degree in computer science and quantitative methods from Austin Peay State University in Clarksville, Tennessee.

Projected to graduate in 2024, she aspires to become a researcher in an industrial research lab and eventually also assume the position of visiting/adjunct professor.


Makanjuola Ogunleye among eight students nationwide to receive Cadence Black Students in Technology Scholarship

Makanjuola Ogunleye is a Ph.D. student in computer science at the Sanghani Center. Photo by Peter Means for Virginia Tech.

Makanjuola Ogunleye, a Ph.D. student in computer science at the Sanghani Center for Artificial Intelligence and Data Analytics, has been awarded a Black Students in Technology Scholarship from Cadence Diversity in Technology Scholarship Programs.

Ogunleye, a member of the Perception and LANguage (PLAN) research lab, is one of eight students pursuing technical degrees at universities across the country who were selected to receive the scholarship based on their impressive academic records, work in the community, leadership potential, and recommendations from professors. He is advised by Ismini Lourentzou, an assistant professor in the Department of Computer Science. Read full story here.


For chatbots and beyond: Improving lives with data starts with improving machine learning

Ruoxi Jia. Photo by Chelsea Seeber for Virginia Tech.

Assistant Professor Ruoxi Jia in the Bradley Department of Electrical and Computer Engineering and core faculty at the Sanghani Center for Artificial Intelligence and Data Analytics at Virginia Tech has received a National Science Foundation (NSF) Faculty Early Career Development (CAREER) award to investigate fundamental theories and computational tools needed to measure the value of data. Read full story here.


Sanghani Center Student Spotlight: Shengzhe Xu

Graphic is from the paper “STAN: Synthetic Network Traffic Generation with Generative Neural Models”

Shengzhe Xu chose to pursue a Ph.D. in computer science at Virginia Tech because the Sanghani Center offered him the opportunity to investigate cutting-edge challenges of academic importance and find ways of applying these methodologies to tackle real-world problems.

“What I like best about the center is that everyone is encouraged to pursue their own areas of interest,” said Xu, who is advised by the center’s director, Naren Ramakrishnan. “As students in this free scientific research environment, we just need to concentrate on improving ourselves and conduct in-depth research on the topics we choose.” 

Xu’s work explores semantic analysis of tabular data as well as synthetic tabular data generation. “A real-world example of this is network traffic data,” he said. “Every operation on the Internet is recorded like a footprint that we can model by using deep learning methods.”

But capturing the semantics of tabular data is a challenging problem. Unlike traditional natural language processing and computer vision fields, the overall portrait of tabular data is difficult for humans — even if they are domain experts — to judge because it has complex dependencies that need to be explored in depth.

“Deep learning models have achieved great success in recent years but progress in some domains like cybersecurity is stymied due to a paucity of realistic datasets. For privacy reasons, organizations are reluctant to share such data, even internally,” he said. “In order to protect the privacy of training data from being leaked, it is important to explore how to generate good enough tabular data in terms of both training performance and privacy protection.”

Xu presented his work on “STAN: Synthetic Network Traffic Generation with Generative Neural Models” at the MLHat Workshop on Deployable Machine Learning for Security Defense during the 2021 SIGKDD Conference on Knowledge Discovery and Data Mining. The paper explored synthetic data generation for real-world network traffic flow data as a way to protect sensitive data from leakage.

Projected to graduate in 2024, Xu hopes to continue his research as an industry professional.


Sanghani Center Student Spotlight: Afrina Tabassum

Graphic is from the paper “Hard Negative Sampling Strategies for Contrastive Representation Learning”

Afrina Tabassum, a Ph.D. student in computer science, was attracted to the Sanghani Center by the trending research conducted by faculty for improving machine learning algorithms and their application to other fields.

Her research interests lie in machine learning and self-supervised learning, particularly designing novel representation learning objectives for multi-modal data. “I was really attracted to this area of research by an urge to use deep learning in order to make people’s lives easier,” she said.

One of the projects Tabassum is working on at the Sanghani Center is “Hard Negative Sampling Strategies for Contrastive Representation Learning,” a collaboration with her advisors, Hoda Eldardiry and Ismini Lourentzou, and a fellow Ph.D. student.

Their paper introduces Uncertainty and Representativeness Mixing (UnReMix) for contrastive training, a method that combines importance scores that capture model uncertainty, representativeness, and anchor similarity. 

“We verify our method on several visual, text and graph benchmark datasets and perform comparisons over strong contrastive baselines,” said Tabassum, “and to the best of our knowledge, we are the first to consider representativeness for hard negative sampling in contrastive learning in a computationally inexpensive way.”

Experimental and qualitative results so far have demonstrated the effectiveness of their proposed approach, she said.

Tabassum is also part of a team from Lourentzou’s PLAN Lab, which is competing in the Alexa Prize Taskbot Challenge 2.

“Ten teams across the world were selected to build a taskbot to assist in cooking and performing other tasks around the house. Our bot will be able to make adaptable conversation a reality by allowing customers to follow personalized decisions through the completion of multiple sequential subtasks and adapt to the tools, materials, or ingredients available to the user by proposing appropriate substitutes and alternatives,” she said.

In addition to working on adapting instructions to user needs, she is serving as student team leader, with responsibilities that include setting clear team goals and short-term deadlines and delegating tasks among team members.

Projected to graduate in 2024, Tabassum would like to pursue a career in industry research.


Dawei Zhou receives Cisco Faculty Research Award to help combat destructive insider threats to cybersecurity

Dawei Zhou

Insider threats to cybersecurity can occur when an actor with authorized access to an organization’s network conducts malicious activities that may expose the organization’s critical information, resulting in severe consequences such as financial loss, system crashes, and national security challenges.

“These threats are on the rise and according to a recent cyber security survey, 27 percent of cybercrime incidents involved insiders,” said Dawei Zhou, an assistant professor in the Department of Computer Science, director of the Virginia Tech Learning on Graphs (VLOG) Lab, and core faculty at the Sanghani Center for Artificial Intelligence and Data Analytics.

One of Zhou’s projects, “Combating Insider Threat: Identification, Monitoring, and Data Augmentation,” targets the challenging problem of how to combat insider threats. He recently received a 2023-2024 Cisco Faculty Research Award that will help support this research.

Zhou said his project uses multiple dynamic and heterogeneous data sources that include internal system logs, employee networks, and email exchange networks.

“Distinctly from other types of terror attacks, insider threats exhibit several unique challenges like rarity, non-separability, label scarcity, dynamicity, and heterogeneity, making it extremely difficult to catch them in time for a successful counter-attack,” said Zhou.

He explains: Rarity means that the absolute number of such insiders is extremely small, especially compared with the total number of employees in a large organization or company; non-separability means that insiders are very good at camouflaging themselves, making them indistinguishable from normal employees and thus able to bypass the detection system; label scarcity means that the annotation process for insiders is labor-intensive and time-consuming; dynamicity refers to the time-evolving nature of the raw input data sources as well as the behaviors of insiders; and heterogeneity refers to the data coming from various sources and in various formats.

“Although different insiders are often conscious and good at camouflaging themselves, they might share some common traits if examined under the proper lens,” he said.

With this in mind, the project will try to combat insider threat via an interactive learning mechanism, building new theories and algorithms for the following learning tasks: 

  • Insider Identification: characterize the descriptive and essential properties of insiders and detect groups of insiders (such as traitors, masqueraders, and unintentional perpetrators) with common traits.

  • Insider Monitoring: track the evolution of insider behaviors over time and provide a visual system for analysis, annotation, and diagnosis.

  • Data Augmentation: sanitize input data by completing missing data and cleaning noisy data, and generate synthetic insiders to alleviate the label scarcity issue.

Computer science Ph.D. students Shuaicheng Zhang and Haohui Wang, who are advised by Zhou, will be working with him on the project. A third student, Weije Guan, will be joining the team in the fall semester.

“We hope that the innovative approach we are taking will result in a better understanding of how to counterattack these threats and ultimately decrease the number of cybercrimes,” Zhou said. 


Virginia Tech researchers receive National Science Foundation award to secure vegetable production in a changing environment

The research team is developing climate-smart, economically efficient, and environmentally sustainable precision agricultural practices that enable more effective and adaptive decision-making as part of our nation’s agricultural priorities. Photo courtesy of USDA.

Virginia Tech researchers in the Center for Advanced Innovation in Agriculture (CAIA) and the Virginia Tech Applied Research Corporation (VT-ARC) were awarded a $750,000 grant by the National Science Foundation Convergence Accelerator program to enhance vegetable production and food security in the commonwealth.

The Sanghani Center for Artificial Intelligence and Data Analytics is a partner on this project. Read full story here.


Lenwood Heath collaborating on plant genome research project funded by National Science Foundation grant

Lenwood Heath

Lenwood Heath, a professor in the Department of Computer Science and core faculty at the Sanghani Center, is part of a team that recently received a National Science Foundation (NSF) grant for its plant genome research project, “Unraveling the origin of vegetative desiccation tolerance in vascular plants.” Heath is collaborating with colleagues from Texas Tech University and the University of Nevada, Reno on the study.

Excessive water loss is lethal for most plants, but a minority of plants (known as resurrection plants) have a remarkable ability to survive almost complete dryness, said Heath. This ability, known as desiccation tolerance, relies upon a combination of physiological, biochemical, and molecular responses that allow the plant to preserve cell integrity in the dry state.

“In the context of climate change,” Heath said, “we feel it is important to understand how plants respond to drying out and especially important to develop the science that will allow crops to better tolerate drought.”

“It is believed that this resurrection capability depends on genes that are in all plants but lost by most over evolutionary times,” Heath said. “The aim of our project is to discover the essential differences in genetic responses between resurrection plants and drought-sensitive plants so that crops can be re-engineered to be more drought tolerant.” 

In addition to sophisticated biological experiments to measure gene response in the two kinds of plants, the project will employ machine learning techniques, led by Heath, to construct gene regulatory networks (GRNs) for comparative study.  

The grant will provide learning and professional opportunities to graduate students and postdocs at the three universities. Jingyi Zhang, a computer science Ph.D. student advised by Heath, will work with him on the project.

Long-term goals for the project include promoting conservation programs for resurrection species; providing diverse scientific workforce training and outreach activities to first-generation students and the general public; and increasing public awareness about the importance of vegetative desiccation tolerance to future crop breeding in order to tackle the effects of climate change. 


Sanghani Center Student Spotlight: Raquib Bin Yousuf


Graphic is from the paper “Lessons from Deep Learning applied to Scholarly Information Extraction: What Works, What Doesn’t, and Future Directions”

Raquib Bin Yousuf, a Ph.D. student in computer science, is exploring the capabilities of large language models to generate text from different forms of data, especially from knowledge graphs. 

A knowledge graph, he said, can be a network of entities and their relationships in any domain. Generating a correct and helpful narrative from a knowledge graph is an important task for users in that domain.

“Although my research focus is on natural language processing, I have been fortunate while at the Sanghani Center to work in some other multidisciplinary domains as well,” he said. “The excellent and diverse work of the faculty is what attracted me to the center and the exposure I have had to real-world problems in these collaborative projects has helped me to learn more and conduct better research.”

Yousuf’s first exposure to his research area was through information retrieval projects on large-scale text data during his undergraduate years.

He has also worked on knowledge extraction projects under the supervision of his advisor, Naren Ramakrishnan, which have involved applying natural language processing methods to large-scale collections of scholarly articles.

“Recently there has been a pivotal innovation in NLP in the form of the Transformer model and subsequent development of large language models,” Yousuf said. “Today’s large language models can work well, across many tasks, with little to no help at all and that has motivated me to look deep into the working nature of these state of art models for real-world applications.” 

At the 2022 SIGKDD Conference on Knowledge Discovery and Data Mining last August in Washington, D.C., he presented “Lessons from Deep Learning applied to Scholarly Information Extraction: What Works, What Doesn’t, and Future Directions.” The paper explored the use of domain-adapted Transformer models as building blocks to develop and deploy an automated End-to-end Research Entity Extractor, capable of extracting technical facets from full-text scholarly research articles in a large-scale dataset.

Yousuf received a bachelor’s degree in computer science and engineering from Bangladesh University of Engineering and Technology (BUET) and a master’s degree in computer science from Virginia Tech.

Projected to graduate in 2025, he hopes to continue his research as an industry professional.


Danfeng ‘Daphne’ Yao, pioneer and expert in enterprise data security, elevated to IEEE fellow

Danfeng “Daphne” Yao

Danfeng “Daphne” Yao, professor in the Department of Computer Science and affiliate faculty at the Sanghani Center for Artificial Intelligence and Data Analytics at Virginia Tech, has been elevated to fellow, the highest grade of membership in the Institute of Electrical and Electronics Engineers (IEEE), for her contributions to enterprise data security and high-precision vulnerability screening. 

Following a rigorous evaluation procedure, fewer than 0.1 percent of voting members in the institute are selected annually for this career milestone. Read more here.