Lifu Huang receives NSF CAREER award to lay new ground for information extraction without relying on humans

Lifu Huang. Photo by Peter Means for Virginia Tech.

Considering the millions of research papers and reports from open domains such as biomedicine, agriculture, and manufacturing, it is humanly impossible to keep up with all the findings.

Constantly emerging world events present a similar challenge because they are difficult to track and even harder to analyze without looking into thousands of articles. 

To address the problem of relying on human effort in situations such as these, Lifu Huang, an assistant professor in the Department of Computer Science and core faculty at the Sanghani Center for Artificial Intelligence and Data Analytics, is researching how machine learning can extract information without relying on humans. Read the full story here.


Researchers study the crowdsourced investigation of Jan. 6, 2021

Kurt Luther is an associate professor of computer science and history. Photo by Olivia Coleman for Virginia Tech.

How has online sleuthing successfully replaced wanted posters?

Researchers within the Virginia Tech Department of Computer Science answered this question by studying the crowdsourced online investigation that followed the Jan. 6, 2021, insurrection at the U.S. Capitol.

Tianjiao “Joey” Yu and Kurt Luther collaborated on the project with Ismini Lourentzou, assistant professor of computer science and core faculty at the Sanghani Center for Artificial Intelligence and Data Analytics, and Sukrit Venkatagiri, a postdoctoral researcher at the University of Washington. Read the full story here.


Amazon-Virginia Tech Initiative showcases innovative approaches to robust and efficient machine learning

(From left) Reza Ghanadan, senior principal scientist, Amazon Alexa and the new Amazon center liaison for the Amazon-Virginia Tech initiative; Shehzad Mevawalla, vice president of Alexa Speech Recognition, Amazon Alexa; Virginia Tech President Tim Sands; Lance Collins, vice president and executive director, Innovation Campus; Julie Ross, the Paul and Dorothea Torgerson Dean of Engineering; Naren Ramakrishnan, the Thomas L. Phillips Professor of Engineering and director of the Amazon-Virginia Tech initiative; and Wanawsha Shalaby, program manager for the Amazon-Virginia Tech initiative. Photo by Lee Friesland for Virginia Tech.

Virginia Tech and Amazon gathered for a Machine Learning Day held at the Virginia Tech Research Center — Arlington on April 25 to celebrate and further solidify their collaborative Amazon–Virginia Tech Initiative for Efficient and Robust Machine Learning.  

Announced last year, the initiative is funded by Amazon, housed in the College of Engineering, and directed by researchers at the Sanghani Center for Artificial Intelligence and Data Analytics on Virginia Tech’s campus in Blacksburg and at the Innovation Campus in Alexandria. It supports student- and faculty-led development and implementation of innovative approaches to robust machine learning, such as ensuring that algorithms and models are resistant to errors and adversaries, that could address worldwide industry-focused problems. Read full story here.


Sanghani Center Student Spotlight: Amarachi Blessing Mbakwe

Graphic is from the paper “CheXRelNet: An Anatomy-Aware Model for Tracking Longitudinal Relationships between Chest X-Rays”

In her research at the Sanghani Center, Ph.D. student Amarachi Blessing Mbakwe is trying to develop advanced artificial intelligence methodologies for better medical imaging and clinical decision-making.

Her passionate drive to improve healthcare systems that could save millions of lives worldwide stems from personal experience. With the deaths of two close family members in her home region in Nigeria, Mbakwe witnessed firsthand the devastating consequences of delayed disease detection, poor treatment management, and a shortage of healthcare professionals. 

Targeted intervention can improve healthcare access for everyone and mitigate the disparities in clinical care often faced by underrepresented populations and minorities, said Mbakwe, who is advised by Ismini Lourentzou.

“By developing an AI algorithm that can accurately and quickly analyze chest x-rays, my research can help reduce the time and effort required for radiologists to interpret medical imaging tests which, in turn, can help ensure timely patient treatment or adjustment of treatments, especially in regions with a shortage of radiologists,” she said.

Mbakwe has published papers and articles in various journals and conferences. She presented a collaborative paper, “CheXRelNet: An Anatomy-Aware Model for Tracking Longitudinal Relationships between Chest X-Rays,” at the 2022 Medical Image Computing and Computer Assisted Intervention Society conference in Singapore and, this spring, at the Computing Research Association 2023 CRA-WP Grad Cohort Workshop for IDEALS in Hawaii and the 2023 Grad Cohort Workshop for Women.

CheXRelNet incorporates local and global visual features, utilizes inter-image and intra-image anatomical information, and learns dependencies between anatomical region attributes via graph attention to accurately predict disease progression for a pair of chest x-rays.

“I was attracted to Virginia Tech’s Department of Computer Science and the Sanghani Center because I wanted to conduct impactful research that benefits society and they provided me with the perfect platform to achieve my goals,” Mbakwe said.

She said that the outcome of her research is not only applicable in healthcare but could also extend further to other applications in fairness and finance. Last summer she had the opportunity to intern at JPMorgan Chase & Co as an AI research associate and will be returning for a second internship this summer.

Mbakwe earned a bachelor’s degree in mathematics from Nnamdi Azikiwe University, Anambra State, Nigeria, and a master’s degree in computer science and quantitative methods from Austin Peay State University in Clarksville, Tennessee.

Projected to graduate in 2024, she aspires to become a researcher in an industrial research lab and eventually also assume the position of visiting/adjunct professor.


Makanjuola Ogunleye among eight students nationwide to receive Cadence Black Students in Technology Scholarship

Makanjuola Ogunleye is a Ph.D. student in computer science at the Sanghani Center. Photo by Peter Means for Virginia Tech.

Makanjuola Ogunleye, a Ph.D. student in computer science at the Sanghani Center for Artificial Intelligence and Data Analytics, has been awarded a Black Students in Technology Scholarship from Cadence Diversity in Technology Scholarship Programs.

Ogunleye, a member of the Perception and LANguage (PLAN) research lab, is one of eight students pursuing technical degrees at universities across the country who were selected to receive the scholarship based on their impressive academic records, work in the community, leadership potential, and recommendations from professors. He is advised by Ismini Lourentzou, an assistant professor in the Department of Computer Science. Read full story here.


For chatbots and beyond: Improving lives with data starts with improving machine learning

Ruoxi Jia. Photo by Chelsea Seeber for Virginia Tech.

Assistant Professor Ruoxi Jia in the Bradley Department of Electrical and Computer Engineering and core faculty at the Sanghani Center for Artificial Intelligence and Data Analytics at Virginia Tech has received a National Science Foundation (NSF) Faculty Early Career Development (CAREER) award to investigate fundamental theories and computational tools needed to measure the value of data. Read full story here.


Sanghani Center Student Spotlight: Shengzhe Xu

Graphic is from the paper “STAN: Synthetic Network Traffic Generation with Generative Neural Models”

Shengzhe Xu chose to pursue a Ph.D. in computer science at Virginia Tech because the Sanghani Center offered him the opportunity to investigate cutting-edge challenges of academic importance and find ways of applying these methodologies to tackle real-world problems.

“What I like best about the center is that everyone is encouraged to pursue their own areas of interest,” said Xu, who is advised by the center’s director, Naren Ramakrishnan. “As students in this free scientific research environment, we just need to concentrate on improving ourselves and conduct in-depth research on the topics we choose.” 

Xu’s work explores semantic analysis of tabular data as well as synthetic tabular data generation. “A real-world example of this is network traffic data,” he said. “Every operation on the Internet is recorded like a footprint that we can model by using deep learning methods.”

But capturing the semantics of tabular data is a challenging problem. Unlike data in traditional natural language processing and computer vision fields, the overall portrait of tabular data is difficult for humans, even domain experts, to judge because it has complex dependencies that need to be explored in depth.

“Deep learning models have achieved great success in recent years but progress in some domains like cybersecurity is stymied due to a paucity of realistic datasets. For privacy reasons, organizations are reluctant to share such data, even internally,” he said. “In order to protect the privacy of training data from being leaked, it is important to explore how to generate good enough tabular data in terms of both training performance and privacy protection.”

Xu presented his work on “STAN: Synthetic Network Traffic Generation with Generative Neural Models” at the MLHat Workshop on Deployable Machine Learning for Security Defense during the 2021 SIGKDD Conference on Knowledge Discovery and Data Mining. The paper explored synthetic data generation in real-world network traffic flow data to protect any sensitive data from data leakage. 

Projected to graduate in 2024, Xu hopes to continue his research as an industry professional.


Sanghani Center Student Spotlight: Afrina Tabassum

Graphic is from the paper “Hard Negative Sampling Strategies for Contrastive Representation Learning”

Afrina Tabassum, a Ph.D. student in computer science, was attracted to the Sanghani Center by the trending research conducted by faculty for improving machine learning algorithms and their application to other fields.

Her research interests lie in machine learning and self-supervised learning, particularly designing novel representation learning objectives for multi-modal data. “I was really attracted to this area of research by an urge to use deep learning in order to make people’s lives easier,” she said.

One of the projects Tabassum is working on at the Sanghani Center is “Hard Negative Sampling Strategies for Contrastive Representation Learning,” a collaboration with her advisors, Hoda Eldardiry and Ismini Lourentzou, and a fellow Ph.D. student.

Their paper introduces Uncertainty and Representativeness Mixing (UnReMix) for contrastive training, a method that combines importance scores that capture model uncertainty, representativeness, and anchor similarity. 
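As a rough illustration of what combining such scores could look like, the Python sketch below weights each negative example by an equal blend of three rescaled scores. It is not the formulation from the paper: the function name `negative_weights`, the use of cosine similarity for anchor similarity, mean similarity to the other negatives as a stand-in for representativeness, an externally supplied `uncertainty` vector, and the equal-weight average followed by a softmax are all assumptions made for this example.

```python
import numpy as np


def _min_max(scores):
    """Rescale scores to [0, 1]; a constant vector maps to all zeros."""
    span = scores.max() - scores.min()
    if span == 0:
        return np.zeros_like(scores)
    return (scores - scores.min()) / span


def negative_weights(anchor, negatives, uncertainty):
    """Hypothetical blend of three importance scores into sampling weights.

    anchor:      (d,) embedding of the anchor example
    negatives:   (n, d) embeddings of candidate negatives, n >= 2
    uncertainty: (n,) per-negative uncertainty, assumed to be computed elsewhere
    """
    anchor = np.asarray(anchor, dtype=float)
    negatives = np.asarray(negatives, dtype=float)
    uncertainty = np.asarray(uncertainty, dtype=float)

    # Unit-normalize so that dot products are cosine similarities.
    anchor = anchor / np.linalg.norm(anchor)
    negs = negatives / np.linalg.norm(negatives, axis=1, keepdims=True)

    # Anchor similarity: negatives close to the anchor are "harder".
    anchor_sim = negs @ anchor

    # Representativeness (assumed proxy): mean similarity to the other
    # negatives, excluding each vector's similarity with itself (which is 1).
    pairwise = negs @ negs.T
    n = len(negs)
    representativeness = (pairwise.sum(axis=1) - 1.0) / (n - 1)

    # Equal-weight average of the rescaled scores, then a softmax.
    combined = (
        _min_max(anchor_sim)
        + _min_max(representativeness)
        + _min_max(uncertainty)
    ) / 3.0
    exp = np.exp(combined - combined.max())
    return exp / exp.sum()


if __name__ == "__main__":
    weights = negative_weights(
        anchor=[1.0, 0.0],
        negatives=[[1.0, 0.1], [0.9, 0.2], [-1.0, 0.0]],
        uncertainty=[0.9, 0.5, 0.1],
    )
    print(weights)
```

In this toy setup, a negative that scores high on all three criteria receives a larger weight than one that is far from the anchor, unrepresentative, and low in uncertainty. How the actual method defines and mixes the scores is described in the paper itself.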

“We verify our method on several visual, text and graph benchmark datasets and perform comparisons over strong contrastive baselines,” said Tabassum, “and to the best of our knowledge, we are the first to consider representativeness for hard negative sampling in contrastive learning in a computationally inexpensive way.”

Experimental and qualitative results so far have demonstrated the effectiveness of their proposed approach, she said.

Tabassum is also part of a team from Lourentzou’s PLAN Lab which is competing in the Alexa Prize Taskbot Challenge 2.

“Ten teams across the world were selected to build a taskbot to assist in cooking and performing other tasks around the house. Our bot will be able to make adaptable conversation a reality by allowing customers to follow personalized decisions through the completion of multiple sequential subtasks and adapt to the tools, materials, or ingredients available to the user by proposing appropriate substitutes and alternatives,” she said.

In addition to working on adapting instructions according to the user’s needs, she is serving as student team leader with responsibilities that include setting clear team goals and short-term deadlines and delegating tasks among all the team members.

Projected to graduate in 2024, Tabassum would like to pursue a career in industry research.


Dawei Zhou receives Cisco Faculty Research Award to help combat destructive insider threats to cybersecurity

Dawei Zhou

Insider threats to cybersecurity can occur when an actor with authorized access to an organization’s network conducts malicious activities that may release the organization’s critical information, which can in turn result in severe consequences such as financial loss, system crashes, and national security challenges.

“These threats are on the rise and according to a recent cyber security survey, 27 percent of cybercrime incidents involved insiders,” said Dawei Zhou, an assistant professor in the Department of Computer Science, director of the Virginia Tech Learning on Graphs (VLOG) Lab, and core faculty at the Sanghani Center for Artificial Intelligence and Data Analytics.

One of Zhou’s projects, “Combating Insider Threat: Identification, Monitoring, and Data Augmentation,” targets the challenging problem of how to combat insider threats. He recently received a 2023-2024 Cisco Faculty Research Award that will help support this research.

Zhou said his project uses multiple dynamic and heterogeneous data sources that include internal system logs, employee networks, and email exchange networks.

“Distinctly from other types of terror attacks, insider threats exhibit several unique challenges like rarity, non-separability, label scarcity, dynamicity, and heterogeneity, making it extremely difficult to catch them in time for a successful counter-attack,” said Zhou.

He explains: Rarity means that the absolute number of such insiders is extremely small, especially compared with the total number of employees in a large organization or company; non-separability means that the insiders are very good at camouflaging themselves to make themselves indistinguishable from normal employees and are thus able to bypass the detection system; label scarcity means that the annotation process of insiders is labor-intensive and time-consuming; dynamicity refers to the time-evolving nature of the raw input data sources as well as the behaviors of insiders; and heterogeneity refers to the heterogeneous data coming from various sources and in various formats.

“Although different insiders are often conscious and good at camouflaging themselves, they might share some common traits if examined under the proper lens,” he said.

With this in mind, the project will try to combat insider threat via an interactive learning mechanism, building new theories and algorithms for the following learning tasks: 

  • Insider Identification: characterize the descriptive and essential properties of insiders and detect groups of insiders (such as traitors, masqueraders, and unintentional perpetrators) with common traits.

  • Insider Monitoring: track the evolution of insider behaviors over time and provide a visual system for analysis, annotation, and diagnosis.

  • Data Augmentation: sanitize input data by completing missing data and cleaning noisy data, and generate synthetic insiders to alleviate the label scarcity issue.

Computer science Ph.D. students Shuaicheng Zhang and Haohui Wang, who are advised by Zhou, will be working with him on the project. A third student, Weije Guan, will be joining the team in the fall semester.

“We hope that the innovative approach we are taking will result in a better understanding of how to counterattack these threats and ultimately decrease the number of cybercrimes,” Zhou said. 


Virginia Tech researchers receive National Science Foundation award to secure vegetable production in a changing environment

The research team is developing climate-smart, economically efficient, and environmentally sustainable precision agricultural practices that enable more effective and adaptive decision-making as part of our nation’s agricultural priorities. Photo courtesy of USDA.

Virginia Tech researchers in the Center for Advanced Innovation in Agriculture (CAIA) and the Virginia Tech Applied Research Corporation (VT-ARC) were awarded a $750,000 grant by the National Science Foundation Convergence Accelerator program to enhance vegetable production and food security in the commonwealth.

The Sanghani Center for Artificial Intelligence and Data Analytics is a partner on this project. Read full story here.