In the era of data explosion, noise and corruption in real-world data caused by accidental outliers, transmission loss, or even adversarial data attacks are inevitable and often result in incorrect data labeling. For example, a negative review in the Internet Movie Database (IMDb) could be mislabeled as positive, or an image of a panda might be mislabeled as a gibbon.
Xuchao Zhang, a Ph.D. student in computer science, is focused on solving the problem of mislabeling.
“Using scalable robust model learning, we propose distributed and online robust algorithms to handle regression and classification problems in the presence of adversarial data corruption,” said Zhang, who is advised by Chang-Tien (C.T.) Lu in the National Capital Region.
Zhang said his research can be broadly applied to noisy datasets in massive real-world applications.
Zhang, who earned a bachelor’s degree at Shanghai Jiao Tong University in China, began his Ph.D. studies in 2009.
“I chose Virginia Tech’s engineering school for its abundance of advanced research resources and outstanding faculty in the field of data mining and machine learning,” Zhang said. “I am very fortunate to work with Dr. Lu as a DAC student.”
He collaborated with Lu and other researchers from Virginia Tech and George Mason University on the study, “Online and Distributed Robust Regressions under Adversarial Data Corruption,” which he presented at the 2017 IEEE International Conference on Data Mining (ICDM) in New Orleans, LA, in November.
His research has also been presented at other conferences, including the ACM International Conference on Information and Knowledge Management (CIKM); the IEEE International Conference on Big Data; and the International Joint Conference on Artificial Intelligence (IJCAI).
Zhang serves on the program committee (research track) for the Association for Computing Machinery’s Special Interest Group on Knowledge Discovery and Data Mining (KDD) and will be attending the 2018 conference in London.
This summer, Zhang heads to Redmond, Washington, where he has an internship at Microsoft Research AI.