Xiaolong Li

Abstract

We propose a point-based spatiotemporal pyramid architecture, called PointMotionNet, to learn motion information from a sequence of large-scale 3D LiDAR point clouds. A core component of PointMotionNet is a novel technique for point-based spatiotemporal convolution, which finds point correspondences across time by leveraging a time-invariant spatial neighboring space and extracts spatiotemporal features. To validate PointMotionNet, we consider two motion-related tasks: point-based motion prediction and multisweep semantic segmentation. For each task, we design an end-to-end system in which PointMotionNet is the core module that learns motion information. We conduct extensive experiments and show that i) for point-based motion prediction, PointMotionNet achieves a mean squared error below 0.5 m on the Argoverse dataset, a significant improvement over existing methods; and ii) for multisweep semantic segmentation, PointMotionNet with a pretrained segmentation backbone outperforms the previous state of the art by over 3.3% mIoU on the SemanticKITTI dataset, which has 25 classes including 6 moving-object classes.

Jun Wang, Xiaolong Li, Alan Sullivan, A. Lynn Abbott, Siheng Chen: PointMotionNet: Point-Wise Motion Learning for Large-Scale LiDAR Point Clouds Sequences. CVPR Workshops 2022: 4418-4427

People

Xiaolong Li


Publication Details

Date of publication:
2022
Conference:
IEEE Conference on Computer Vision and Pattern Recognition
Page number(s):
4418-4427