This website provides a list of frequently used computer vision datasets. Wait, there is more!
Each dataset also has a description covering common problems, pitfalls and characteristics, and there is now a searchable tag cloud.
Plus, this is open for crowd editing (if you pass the ultimate Turing test)! - Questions? yacvid [at] hayko [dot] at
Content, Design and Idea © by Hayko Riemenschneider, 2011-2018. Texts and images are subject to the copyright of their respective authors.
|478||UE4Sim and Sim4CV||Sim4CV is the general environment for simulating data for computer vision tasks, like object tracking, pose estimation, detection, action recognition, indoor sc...||synthetic object tracking, pose estimation, detection, action recognition, indoor scene understanding, multi-agent collaboration, autonomous navigation, 3d reconstruction, crowd understanding, urban scene understanding, human tracking, aerial surveying. simulation environment 3d photo-realistic realism depth segmentation urban rgb render||link||2019-03-25||1246|
|477||House3D: A Rich and Realistic 3D Environment||House3D is a virtual 3D environment which consists of thousands of indoor scenes equipped with a diverse set of scene types, layouts and objects sourced from th...||house indoor simulation environment 3d photo-realistic realism depth segmentation urban rgb render||link||2018-11-30||707|
|462||Taskonomy||The Taskonomy dataset consists of 3.9 Mil. Scenes, 600 Buildings, 25 Tags per Image, 1024 Resolution for taxonomy and transfer learning tasks. We provide a larg...||transfer learning taxonomy task deep indoor 3d mesh pose camera high-resolution||link||2018-08-08||472|
|454||SBM-RGBD Dataset||The SBM-RGBD dataset [provides] all facilities (data, ground truths, and evaluation scripts) in order to evaluate and compare scene background modelling metho...||background modeling rgbd kinect video color depth benchmark indoor surveillance||link||2019-08-12||915|
|411||ISR-UoL 3D Social Activity Dataset||This is a social interaction dataset between two subjects. This dataset consists of RGB and depth images and tracked skeleton data (i.e. joints 3D coordinates a...||Social, Activity, Interaction, Human, Indoor, Skeleton, RGBD, ROS action||link||2017-11-28||697|
|394||Matterport 2D-3D-Semantics Data||The 2D-3D-S dataset provides a variety of mutually registered modalities from 2D, 2.5D and 3D domains, with instance-level semantic and geometric annotations. I...||3d panorama semantic segmentation depth normal indoor building reconstruction large-scale||link||2017-07-27||1266|
|376||ScanNet||ScanNet is an RGB-D video dataset containing 2.5 million views in more than 1500 scans, annotated with 3D camera poses, surface reconstructions, and instance-le...||scene indoor synthetic cad room layout rendering realism 3d segmentation object recognition||link||2017-05-12||967|
|375||SUNCG: Indoor Scenes||The SUNCG dataset is a Large 3D Model Repository for Indoor Scenes. SUNCG is an ongoing effort to establish a richly-annotated, large-scale dataset of 3D s...||scene indoor synthetic room layout rendering realism 3d segmentation object recognition||link||2020-06-01||2176|
|374||SceneNet RGB-D Synthetic Indoor||SceneNet RGB-D is a dataset comprising 5 million Photorealistic Images of Synthetic Indoor Trajectories with Ground Truth. It expands the previous work of Scene...||scene indoor synthetic robot navigation rendering 3d reconstruction trajectory lighting segmentation slam||link||2017-05-02||974|
|366||Multi-Camera Action Dataset||An indoor action recognition dataset which consists of 18 classes performed by 20 individuals. Each action is performed individually 8 times (4 daytime and ...||indoor video Multi-Camera Action Recognition Cross-View Recognition Open-View Recognition||link||2017-09-12||879|
|355||IMPART multi-modal/multi-view||The multi-modal/multi-view datasets are created in a cooperation between University of Surrey and Double Negative within the EU FP7 IMPART project. The sourc...||multiview multi-mode video rgbd lidar 3d model color indoor outdoor dynamic action face human emotion||link||2019-03-16||1009|
|331||EuRoC MAV Dataset||This web page presents visual-inertial datasets collected on-board a Micro Aerial Vehicle (MAV). The datasets contain stereo images, synchronized IMU measuremen...||aerial vehicle, indoor, global shutter, slam||link||2017-11-28||1892|
|327||PIROPO Database: People in Indoor ROoms with Perspective and Omnidirectional cameras||The PIROPO database (People in Indoor ROoms with Perspective and Omnidirectional cameras) comprises multiple sequences recorded in two different indoor rooms, u...||people surveillance perspective omnidirectional fisheye indoor room detection human||link||2017-02-16||1865|
|295||Rent3D||The Rent3D dataset comprises floorplans and images. The goal of this work is to enable a 3D virtual-tour of an apartment given a small set of monocular images o...||indoor building reconstruction layout floorplan apartment urban||link||2015-07-13||1317|
|286||HDA Person Dataset - ISR Lisbon||The High Definition Analytics (HDA) dataset is a multi-camera High-Resolution image sequence dataset for research on High-Definition surveillance: Pedestrian De...||Video Surveillance Pedestrian Detection Re-Identification Multiview Tracking Benchmark Indoor High-Definition Camera Network lisbon human||link||2020-03-17||4093|
|271||Labeling in 3D Scenes||This dataset package contains the software and data used for Detection-based Object Labeling on the RGB-D Scenes Dataset as implemented in the paper: Detecti...||3d kinect reconstruction indoor depth object recognition||link||2015-03-16||1347|
|270||B3DO: Berkeley 3D Object Dataset||For the first few decades of the field's existence, computer vision has been focused on algorithmic, logical approaches to perception. But it was only with the a...||3d kinect reconstruction indoor depth object recognition||link||2020-02-25||1276|
|181||All I Have Seen (AIHS)||The All I Have Seen (AIHS) dataset was created to study the properties of total visual input in humans. For around two weeks Nebojsa Jojic wore a camera capturin...||video summary user study clustering similarity outdoor indoor scene 3d||link||2018-09-19||1386|
|168||Mall Dataset||The Mall dataset was collected from a publicly accessible webcam for crowd counting and profiling research. Ground truth: Over 60,000 pedestrians were label...||detection tracking crowd counting pedestrian indoor video webcam||link||2020-04-28||3438|
|166||ICG Multi-Camera Datasets||The ICG Multi-Camera datasets consist of: Easy Data Set (just one person), Medium Data Set (3-5 persons, used for the experiments), Hard Data Set (crowded sc...||multiview pedestrian tracking detection object camera calibration graz indoor video multitarget||link||2015-06-19||2228|
|163||TUGRAZ ICG Longterm Pedestrian Dataset||The Longterm Pedestrian dataset consists of images from a stationary camera running 24 hours for 7 days at about 1 fps. It is used for adaptive detection and back...||pedestrian change detection background illumination robust indoor coffee graz multitarget||link||2020-05-04||1740|
|104||Make3D Depth||The Make3D Depth dataset is designed to learn features to estimate scene depth from a single image. This dataset contains aligned image and range data: Make3...||depth, learning, single view, outdoor, indoor||link||2019-04-03||2406|
|15||PETS 2006||The PETS 2006 dataset contains 7 parts showing multi-sensor sequences containing left-luggage scenarios with increasing scene complexity at a train station scen...||frontview, indoor, pedestrian, detection, tracking, multitarget||link||2019-05-29||2296|