Yet Another Computer Vision Index To Datasets (YACVID) - Details

As of: 2018-10-19 22:28:55 - Overview

Name: ApolloScape Semantic Segmentation
Description: The ApolloScape Scene Parsing dataset is provided by Baidu for the CVPR 2018 Workshop on Autonomous Driving challenge.
The Scene Parsing dataset is expected to include 200,000 images with corresponding semantic annotation and depth information (as of March 8, 2018, more than 80,000 images have been released). It is divided into three parts, training, validation, and test sets, with file lists provided for each. The training and validation sets, which come with pixel-level annotations and depth information, are used to design algorithms and train models. The semantic annotation images of the test set are used for evaluation and are not available for download.

Total number of RGB images: 200000

Total number of depth images: 200000

Total number of classes: 25

Total number of lane markings: 28

Image resolution: 3384 x 2710

GPS trajectories: Yes

Camera intrinsic and extrinsic parameters: Yes

Interval between two image frames: 1 meter

3 Data Example


Point cloud example


Depth map

4 Dataset Download

The dataset will be released in stages. This release contains the labeled RGB images:

road01_ins.tar.gz
road02_ins.tar.gz
road03_ins.tar.gz
road04_ins.tar.gz
Note: All photos may only be used for educational purposes by individuals or organizations. Commercial use or other violations of copyright law are not permitted.

5 Dataset Structure

Folder structure of the dataset

{root} / {type} / {road id} _ {level} / {record id} / {camera id} / {timestamp} _ {camera id} {ext}

root: the root folder defined by users.

type: there are three data types in current release, i.e., ColorImage, Label, and Pose.

road id: the road id, e.g., road001, road002.

level: two different levels; seg means the labels contain pixel-level labels only, ins means the labels contain both pixel-level and instance-level labels.

record id: the record id, e.g., Record001, Record002. Each record contains up to a few thousand images.

camera id: two front cameras are used in our acquisition system, i.e., Camera 5 and Camera 6.

timestamp: the first part of the image name.

camera id: the second part of the image name.

ext: the extension of the file. .jpg for color image, _bin.png for label image, .json for the polygon list of instance-level labels, and _instanceIds.png for instance-level labels.
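As a sketch, the naming scheme above can be assembled programmatically. The concrete road, record, and timestamp values below are hypothetical placeholders; only the path structure follows the documentation:

```python
import os

def image_path(root, type_, road_id, level, record_id, camera_id,
               timestamp, ext):
    """Build a path following the documented scheme:
    {root}/{type}/{road id}_{level}/{record id}/{camera id}/
    {timestamp}_{camera id}{ext}
    """
    return os.path.join(root, type_, f"{road_id}_{level}", record_id,
                        camera_id, f"{timestamp}_{camera_id}{ext}")

# Hypothetical example values; only the structure is from the docs.
p = image_path("/data/apolloscape", "ColorImage", "road02", "ins",
               "Record001", "Camera 5", "170927_063811892", ".jpg")
print(p)
```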

There is only one pose file (i.e., pose.txt) for each camera and each record. This pose file contains all the extrinsic parameters for all the images of the corresponding camera and record. The format of each line in the pose file is as follows:

r00 r01 r02 t0 r10 r11 r12 t1 r20 r21 r22 t2 0 0 0 1 image_name
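Under that format, each line can be split into a 4x4 row-major extrinsic matrix plus the image name. A minimal pure-Python sketch; the sample line below uses placeholder values (identity rotation, zero translation), not real dataset poses:

```python
def parse_pose_line(line):
    """Split one pose.txt line into a 4x4 row-major extrinsic
    matrix and the trailing image name."""
    parts = line.split()
    vals = [float(v) for v in parts[:16]]
    matrix = [vals[i * 4:(i + 1) * 4] for i in range(4)]
    return matrix, parts[16]

# Placeholder line: identity rotation, zero translation.
line = "1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 170927_063811892_Camera_5.jpg"
T, name = parse_pose_line(line)
print(T[0])   # → [1.0, 0.0, 0.0, 0.0]
print(name)   # → 170927_063811892_Camera_5.jpg
```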

The cameras have been well calibrated and undistorted. The intrinsic parameters of cameras are:

Camera 5:

fx=2304.54786556982

fy=2305.875668062

Cx=1686.23787612802

Cy=1354.98486439791

Camera 6:

fx=2300.39065314361

fy=2301.31478860597

Cx=1713.21615190657

Cy=1342.91100799715
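These parameters fit the standard 3x3 pinhole intrinsic matrix. A sketch using the Camera 5 values listed above; the projection helper is an illustration of how the matrix is used, not part of the dataset tooling:

```python
def intrinsic_matrix(fx, fy, cx, cy):
    """Standard pinhole intrinsic matrix K."""
    return [[fx, 0.0, cx],
            [0.0, fy, cy],
            [0.0, 0.0, 1.0]]

def project(K, x, y, z):
    """Project a 3D point in camera coordinates to pixel coordinates."""
    u = K[0][0] * x / z + K[0][2]
    v = K[1][1] * y / z + K[1][2]
    return u, v

K5 = intrinsic_matrix(2304.54786556982, 2305.875668062,
                      1686.23787612802, 1354.98486439791)
# A point on the optical axis projects to the principal point.
print(project(K5, 0.0, 0.0, 10.0))
```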

 
URL Link: http://apolloscape.auto/scene.html
Tags: segmentation semantic scene benchmark size urban autonomous driving camera calibration
Last Changed: 2018-10-19