The CMU/VMR Urban Image+Laser dataset contains 372 images, each linked with projections of 3D laser points. Additional images exist (captured while the laser scanner was turned off) and are not used.
The dataset is used for co-training and co-evaluation of label inference between sparse 3D points and dense 2D pixels. Evaluation is performed over 5 different partitions (297 train / 75 test) of the data, grouped by time so that temporally overlapping frames do not appear in both the training and test sets.
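The time-grouped partitioning described above can be sketched as follows. This is a minimal illustration of the idea (frames bucketed by capture time, with whole buckets held out), not the dataset's actual partitioning code; the frame IDs and time buckets are hypothetical.

```python
def time_grouped_split(frames, test_buckets):
    """Split frames into train/test so that no time bucket straddles the boundary.

    frames: list of (frame_id, time_bucket) pairs (hypothetical structure).
    test_buckets: set of time buckets reserved for testing.
    """
    train = [fid for fid, bucket in frames if bucket not in test_buckets]
    test = [fid for fid, bucket in frames if bucket in test_buckets]
    return train, test

# Toy example: 10 frames falling into 5 time buckets (2 frames per bucket);
# holding out bucket 4 keeps its temporally adjacent frames together in test.
frames = [(i, i // 2) for i in range(10)]
train, test = time_grouped_split(frames, {4})
```

Holding out entire time buckets, rather than sampling frames uniformly, is what prevents near-duplicate consecutive frames from leaking between train and test.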
Each file provides a 3D point cloud whose points are referenced to image columns/rows, together with the corresponding RGB image and ground-truth label image. The label space is defined over 29 semantic categories that typically occur in urban environments, including road, sidewalk, ground, building, barrier, bus-stop, stairs, shrub, tree-trunk, tree-top, small-vehicle, big-vehicle, person, tall-light, post, sign, utility-pole, wire, and traffic-signal.
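The per-file linkage between sparse 3D points and the dense 2D grid can be sketched as below. This is an assumed layout for illustration only (array names, shapes, and the use of -1 for "no projected point" are not taken from the dataset's file format): each 3D point carries the image row/column it projects to, so per-point labels can be scattered onto the image grid.

```python
import numpy as np

# Toy image size and three projected points (all values hypothetical).
H, W = 4, 6
rows = np.array([0, 1, 3])           # per-point image row indices
cols = np.array([2, 5, 0])           # per-point image column indices
point_labels = np.array([7, 3, 7])   # per-point semantic label IDs

# Sparse label image: -1 marks pixels with no projected laser point.
proj = np.full((H, W), -1, dtype=int)
proj[rows, cols] = point_labels
```

Most pixels remain unlabeled by the laser (-1 here), which is exactly the sparse-3D / dense-2D gap the co-training setup addresses.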
Co-inference for Multi-modal Scene Analysis
D. Munoz, J. A. Bagnell, and M. Hebert, ECCV 2012