News featuring Patrick Butler

Sanghani Center leads collaborative study to improve both discovery and traceability of illegally sourced timber

Reference sample collections from World Forest ID

Virginia Tech has received funding from the National Science Foundation for a collaborative research project that brings machine learning and data science research to the domain of Stable Isotope Ratio Analysis (SIRA) to improve discovery and traceability of illicitly sourced timber products. Illegal timber trade (ITT) is the most profitable natural-resource crime, valued at $50 billion to $152 billion per year.

Naren Ramakrishnan, the Thomas L. Phillips Professor of Engineering and director of the Sanghani Center for Artificial Intelligence and Data Analytics, is serving as principal investigator for the project with the University of Washington, World Forest ID, and Simeone Consulting, LLC.

“To enforce timber regulations and international frameworks, there is a need for accurate, cost-effective, and high-throughput tools that can be used to identify and trace illegally sourced timber products,” Ramakrishnan said. 

The team brings together data scientists, analytical chemists, geospatial and remote sensing scientists, practitioners, international trade and supply chain specialists, and field experts who conduct reference sample expeditions to bring novel data science approaches to analyzing a range of geospatial and remotely sensed datasets.

Patrick Butler, senior research associate, and Brian Mayer, research associate, both at the Sanghani Center, will be part of the Virginia Tech team.

Key foci of this project include machine learning methods for SIRA analytics; location determination from isotopic ratios; and active sampling strategies to close the loop. Foundational machine learning contributions in science-guided machine learning, contrastive and generative learning paradigms, and active sampling algorithms will support not only the specific domain of SIRA but other adjacent domains in environmental conservation, agricultural forecasting, and smart farm modeling. 

“For example, what we learn from our research could be directly applicable to tracing many other illicitly-sourced products and product inputs, including forest risk commodities such as cocoa, soy, and beef,” said L. Monika Moskal, professor at the University of Washington.

The study will have broad and far-reaching impacts on American security and prosperity, as well. 

“Many key U.S. adversaries rely on illegal logging to finance their activities,” said Jade Saunders, executive director at World Forest ID. “Detecting and curbing such activities will moderate sources of regional instability and threats to U.S. interests.”

The project will improve the accuracy of geospatial predictions of product origin and will enable a cost-benefit analysis to minimize future data collection costs and optimize prediction gain. It will also positively affect U.S. economic competitiveness by reducing competition with illicit actors and moderating risks to international trade, Ramakrishnan said.


Research award aims to develop new algorithms for information extraction and understanding from scholarly literature

Naren Ramakrishnan, Director of DAC and Professor in the Department of Computer Science

The Discovery Analytics Center (DAC) has received a research award from the Center for Security and Emerging Technology (CSET) at Georgetown University to support data-informed analysis for policymakers concerning emerging technologies and their security implications. DAC will develop methods to extract novel insights at scale from full-text analytics of publications to better understand emerging technologies and their prevalence, spatial and temporal trends, and relationships.

“Algorithmic components developed by DAC will go into a high-performance pipeline that enables inspection of extracted patterns as well as the lineage of data transformations underlying the patterns,” said Naren Ramakrishnan, the Thomas L. Phillips Professor of Engineering and DAC director, who is the principal investigator for the project.

Ramakrishnan’s team at DAC — which includes senior research associate Patrick Butler; research associate Brian Mayer; and three Ph.D. students — will develop a machine learning framework based on weak supervision to process full-text AI publications into extracted structured fields, such as information on computational platforms utilized, language and library dependencies, compute time, research methods, objective tasks, and links to source code and data resources.

The initial focus will be on arXiv as researchers evaluate progress, followed by extraction from China National Knowledge Infrastructure (CNKI) literature, which provides full-text articles from more than 8,000 Chinese journals covering natural sciences, engineering, technology, agriculture, medicine, and selected topics in economics and social sciences.

This project is providing DAC with the opportunity to build on its prior work in extracting information from news articles about civil unrest events. It will also be informed by DAC’s experience with automated extraction of epidemiological line lists from disease reports, which is used to develop custom word embeddings aimed at recognizing the typical language patterns in how computational details are described in the scholarly literature.

“This project brings together machine learning, computational linguistics, and human-computer interaction capabilities to extract features at scale. The information we extract will be mapped over time to help identify key trends and potential gaps that can support analysts and policymakers at CSET,” said Ramakrishnan.

“We are looking forward to seeing how this innovative work can help inform CSET’s analysis as we strive to inform the future of AI policy,” said Dewey Murdick, director of Data Science at CSET.


DAC and BI lead DARPA’s Next Generation Social Science Project

Brian Goode (left), from the Discovery Analytics Center, and Chris Kuhlman, from the Biocomplexity Institute at Virginia Tech, collaborate on developing models for large-scale social behavior.

DAC and the Biocomplexity Institute (BI) are leading a $3 million grant awarded by the Defense Advanced Research Projects Agency (DARPA) as part of the Next Generation Social Science (NGS2) program. DAC and BI will conduct research that will streamline modeling processes, experimental design, and methodology in the social sciences. A major objective of the project is to make social science experiments rigorous, reproducible, and scalable to large populations.


DAC Director Naren Ramakrishnan named Inventor of the Month

Members of the staff of the Discovery Analytics Center. Left to right are Nathan Self, Patrick Butler, and Naren Ramakrishnan.

DAC and its director, Naren Ramakrishnan, are featured as this month’s Virginia Tech Inventors of the Month by the Office of Research and Innovation for their work on the Early Model Based Event Recognition using Surrogates (EMBERS) software project.

EMBERS is a fully automated system for forecasting significant societal events, such as influenza-like illness case counts, rare disease outbreaks, civil unrest, domestic political crises, and elections, from open source surrogates.