Description (include details on usage, files and paper references) | The ApolloScape Parsing dataset is provided by Baidu for the CVPR 2018 Workshop on Autonomous Driving Challenge.
The Scene Parsing dataset is expected to include 200 thousand images with corresponding semantic annotations and depth information (as of March 8, 2018, more than 80,000 images have been released). The data is divided into training, validation, and test sets, and a list of files is provided. The training and validation sets, which come with pixel-level annotations and depth information, are used to design algorithms and train models. The semantic annotation images of the test set are used for testing and are not available for download.
Total number of RGB images: 200000
Total number of depth images: 200000
Total number of classes: 25
Total number of lane markings: 28
Image resolution: 3384 x 2710
GPS trajectories: Yes
Camera intrinsic and extrinsic parameters: Yes
Interval between two image frames: 1 meter
3 · Data Example
Point cloud example
Depth map
4 · Dataset Download
We will release the dataset in stages. This release contains the labeled RGB images:
road01_ins.tar.gz
road02_ins.tar.gz
road03_ins.tar.gz
road04_ins.tar.gz
Note: All photos may be used only for educational purposes by individuals or organizations. Commercial use or other violations of copyright law are not permitted.
5 · Dataset Structure
Folder structure of the dataset
{root}/{type}/{road id}_{level}/{record id}/{camera id}/{timestamp}_{camera id}{ext}
root: the root folder defined by users.
type: there are three data types in the current release, i.e., ColorImage, Label, and Pose.
road id: the road id, e.g., road001, road002.
level: one of two levels. seg means the labels contain pixel-level labels only; ins means the labels contain both pixel-level and instance-level labels.
record id: the record id, e.g., Record001, Record002. Each record contains up to a few thousand images.
camera id: two front cameras are used in our acquisition system, i.e., Camera 5 and Camera 6.
timestamp: the first part of the image name.
camera id: the second part of the image name.
ext: the extension of the file: .jpg for color images, _bin.png for label images, .json for the polygon lists of instance-level labels, and _instanceIds.png for instance-level labels.
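The folder structure above can be turned into a small path builder. The following is a minimal sketch, not an official tool: the function name `build_sample_path`, the `kind` keys, and the example values passed to it are hypothetical, and it assumes the camera id appears in the file name exactly as it does in the folder name.

```python
from pathlib import PurePosixPath

# File-name suffixes taken from the "ext" description above.
# The dictionary keys are hypothetical labels chosen for this sketch.
EXTENSIONS = {
    "color": ".jpg",                 # color image
    "label": "_bin.png",             # label image
    "polygons": ".json",             # polygon list of instance-level labels
    "instance": "_instanceIds.png",  # instance-level labels
}


def build_sample_path(root, data_type, road_id, level, record_id,
                      camera_id, timestamp, kind):
    """Assemble a path following
    {root}/{type}/{road id}_{level}/{record id}/{camera id}/{timestamp}_{camera id}{ext}.
    """
    if level not in ("seg", "ins"):
        raise ValueError(f"unknown level: {level!r}")
    # Assumption: the camera id is written identically in folder and file name.
    file_name = f"{timestamp}_{camera_id}{EXTENSIONS[kind]}"
    return (PurePosixPath(root) / data_type / f"{road_id}_{level}"
            / record_id / camera_id / file_name)


# Placeholder values, for illustration only.
example = build_sample_path("/data/apolloscape", "ColorImage", "road02", "ins",
                            "Record001", "Camera 5", "TIMESTAMP", "color")
print(example)
```

Swapping `data_type` to `Label` and `kind` to `label` or `instance` should give the matching annotation path for the same frame, provided the real file names follow the template as assumed here.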
There is only one pose file (i.e., pose.txt) for each camera and each record. This pose file contains all the extrinsic parameters for all the images of the corresponding camera and record. The format of each line in the pose file is as follows:
r00 r01 r02 t0 r10 r11 r12 t1 r20 r21 r22 t2 0 0 0 1 image_name
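As an illustration of this line format, here is a minimal sketch of a parser that splits one line into a 4 x 4 matrix, its rotation and translation parts, and the image name. The function name `parse_pose_line` and the sample line are hypothetical, and the sketch assumes the fields are separated by whitespace as shown above.

```python
def parse_pose_line(line):
    """Parse 'r00 r01 r02 t0 ... 0 0 0 1 image_name' into its components."""
    fields = line.split()
    if len(fields) < 17:
        raise ValueError("expected 16 matrix entries followed by an image name")
    values = [float(v) for v in fields[:16]]
    # The 16 values are listed row by row, so cut them into four rows of four.
    matrix = [values[i:i + 4] for i in range(0, 16, 4)]
    rotation = [row[:3] for row in matrix[:3]]    # r00 .. r22
    translation = [row[3] for row in matrix[:3]]  # t0, t1, t2
    # Everything after the matrix is treated as the image name.
    image_name = " ".join(fields[16:])
    return matrix, rotation, translation, image_name


# Hypothetical line: identity rotation with translation (10, 20, 30).
sample = "1 0 0 10 0 1 0 20 0 0 1 30 0 0 0 1 example_image"
matrix, rotation, translation, image_name = parse_pose_line(sample)
print(translation, image_name)
```

Reading a whole pose.txt would then amount to calling this function on every non-empty line and keying the results by image name.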
The cameras have been well calibrated and undistorted. The intrinsic parameters of cameras are:
Camera 5:
fx=2304.54786556982
fy=2305.875668062
Cx=1686.23787612802
Cy=1354.98486439791
Camera 6:
fx=2300.39065314361
fy=2301.31478860597
Cx=1713.21615190657
Cy=1342.91100799715
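To show how these intrinsic parameters are typically used, the sketch below applies the standard pinhole projection, u = fx * X / Z + Cx and v = fy * Y / Z + Cy, to a 3D point given in camera coordinates. The function names are hypothetical, and the axis convention (X right, Y down, Z forward) is an assumption that the description does not state. No distortion model is applied because the images are described as undistorted.

```python
# Intrinsics copied from the listing above (in pixels).
INTRINSICS = {
    "Camera 5": {"fx": 2304.54786556982, "fy": 2305.875668062,
                 "cx": 1686.23787612802, "cy": 1354.98486439791},
    "Camera 6": {"fx": 2300.39065314361, "fy": 2301.31478860597,
                 "cx": 1713.21615190657, "cy": 1342.91100799715},
}
IMAGE_WIDTH, IMAGE_HEIGHT = 3384, 2710  # image resolution stated above


def project_point(camera, x, y, z):
    """Project a camera-frame point (x, y, z) to pixel coordinates (u, v)."""
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    k = INTRINSICS[camera]
    u = k["fx"] * x / z + k["cx"]
    v = k["fy"] * y / z + k["cy"]
    return u, v


def in_image(u, v):
    """Check whether a pixel coordinate falls inside the image bounds."""
    return 0 <= u < IMAGE_WIDTH and 0 <= v < IMAGE_HEIGHT


# A point on the optical axis lands on the principal point (Cx, Cy).
u, v = project_point("Camera 5", 0.0, 0.0, 10.0)
print(u, v, in_image(u, v))
```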
|
|