MaskRNN: Instance Level Video Object Segmentation

Jia-Bin Huang

Abstract

Instance-level video object segmentation is an important technique for video editing and compression. To capture temporal coherence, we develop MaskRNN, a recurrent neural network approach that, in each frame, fuses the outputs of two deep nets for each object instance: a binary segmentation net that provides a mask and a localization net that provides a bounding box. Thanks to the recurrent and localization components, our method can exploit long-term temporal structure in the video data and reject outliers. We validate the proposed algorithm on three challenging benchmark datasets, DAVIS-2016, DAVIS-2017, and SegTrack v2, achieving state-of-the-art performance on all of them.
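
The per-frame fusion described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes one plausible fusion rule: foreground probability outside an instance's bounding box is treated as an outlier and set to zero, and each pixel is then assigned to the most confident instance. The names fuse_mask_with_box and merge_instances, the box layout, and the 0.5 threshold are all hypothetical, and the recurrent component is not modeled.

```python
from typing import List, Tuple

Mask = List[List[float]]         # per-pixel foreground probability (rows x cols)
Box = Tuple[int, int, int, int]  # assumed layout: (top, left, bottom, right), end-exclusive


def fuse_mask_with_box(prob: Mask, box: Box) -> Mask:
    """Zero out foreground probability that falls outside the bounding box."""
    top, left, bottom, right = box
    return [
        [p if (top <= r < bottom and left <= c < right) else 0.0
         for c, p in enumerate(row)]
        for r, row in enumerate(prob)
    ]


def merge_instances(fused: List[Mask], threshold: float = 0.5) -> List[List[int]]:
    """Label each pixel with the 1-based index of its most confident instance.

    Pixels whose best probability is below the threshold stay 0 (background).
    Ties go to the instance with the higher index.
    """
    rows, cols = len(fused[0]), len(fused[0][0])
    labels = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            best_p, best_i = max((m[r][c], i) for i, m in enumerate(fused, start=1))
            if best_p >= threshold:
                labels[r][c] = best_i
    return labels


# Toy 2x3 frame with two object instances (made-up numbers).
prob_a = [[0.9, 0.8, 0.7], [0.1, 0.2, 0.9]]
prob_b = [[0.0, 0.1, 0.6], [0.0, 0.3, 0.8]]
fused_a = fuse_mask_with_box(prob_a, (0, 0, 2, 2))  # box covers columns 0-1
fused_b = fuse_mask_with_box(prob_b, (0, 2, 2, 3))  # box covers column 2
labels = merge_instances([fused_a, fused_b])
print(labels)  # [[1, 1, 2], [0, 0, 2]]
```

In this toy frame, the high responses of the first instance in column 2 lie outside its box and are suppressed, so those pixels go to the second instance instead.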

Publication Details

Date of publication: March 28, 2018

Journal: arXiv

Publication Note: Yuan-Ting Hu, Jia-Bin Huang, Alexander G. Schwing: MaskRNN: Instance Level Video Object Segmentation. CoRR abs/1803.11187 (2018)