Outlier detection has been an active area of research for a few decades. We propose a new definition of an outlier that is useful for high-dimensional data. According to this definition, given a dictionary of atoms learned using the sparse coding objective, the outlierness of a data point depends jointly on two factors: how frequently each atom is used in reconstructing all data points (expressed as its negative log activity ratio, NLAR) and the strength with which that atom is used in reconstructing the current point. We develop a Rarity based Outlier Detection algorithm in a Sparse coding framework (RODS) that consists of two components: NLAR learning and outlier scoring. The algorithm is unsupervised, and both offline and online variants are presented. It is governed by very few manually tunable parameters and operates in linear time. We demonstrate the superior performance of RODS in comparison with various state-of-the-art outlier detection algorithms on several benchmark datasets. We also demonstrate its effectiveness in three real-world case studies: saliency detection in images, abnormal event detection in videos, and change detection in data streams. Our evaluations show that RODS outperforms competing algorithms reported in the outlier detection, saliency detection, video event detection, and change detection literature.
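The two factors in the definition above can be illustrated with a minimal sketch. The abstract does not state the exact scoring formula, so everything here is an assumption made for illustration: the activity ratio of an atom is taken to be the fraction of points whose sparse code uses it, and a point's score is the sum of its absolute coefficients weighted by NLAR. The function names `nlar` and `outlier_scores` and the hand-written `codes` matrix are hypothetical; in practice the codes would come from a learned dictionary.

```python
import math


def nlar(codes, eps=1e-12):
    """Negative log activity ratio for each atom.

    codes: one sparse code per data point, all of the same length.
    Assumption: an atom's activity ratio is the fraction of points
    whose code has a non-zero coefficient for that atom.
    """
    n_points = len(codes)
    n_atoms = len(codes[0])
    ratios = [
        sum(1 for code in codes if code[k] != 0) / n_points
        for k in range(n_atoms)
    ]
    # eps guards against log(0) for atoms that are never used.
    return [-math.log(max(r, eps)) for r in ratios]


def outlier_scores(codes, nlar_values):
    """Assumed scoring rule: sum of |coefficient| * NLAR over atoms."""
    return [
        sum(abs(a) * w for a, w in zip(code, nlar_values))
        for code in codes
    ]


# Toy codes (hand-written, not learned): atom 0 is used by every point,
# atom 2 only by the last point, atom 1 by none.
codes = [
    [1.0, 0.0, 0.0],
    [0.8, 0.0, 0.0],
    [1.2, 0.0, 0.0],
    [0.1, 0.0, 0.9],
]
weights = nlar(codes)
scores = outlier_scores(codes, weights)
most_outlying = scores.index(max(scores))
```

In this toy case the last point receives the highest score, because it relies strongly on an atom that is rarely used by the other points, while points reconstructed only from the commonly used atom score zero.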
- Date of publication: September 2, 2016
- IEEE Transactions on Knowledge and Data Engineering