This website provides a list of frequently used computer vision datasets. Wait, there is more!
There is also a description containing common problems, pitfalls and characteristics and now a searchable TAG cloud.
Plus, this is open for crowd editing (if you pass the ultimate Turing test)! - Questions? yacvid [at] hayko [dot] at
Content, Design and Idea © by Hayko Riemenschneider, 2011-2018. Texts and images are subject to copyright by their respective authors.
|478||UE4Sim and Sim4CV||Sim4CV is the general environment for simulating data for computer vision tasks, like object tracking, pose estimation, detection, action recognition, indoor sc...||synthetic object tracking, pose estimation, detection, action recognition, indoor scene understanding, multi-agent collaboration, autonomous navigation, 3d reconstruction, crowd understanding, urban scene understanding, human tracking, aerial surveying. simulation environment 3d photo-realistic realism depth segmentation urban rgb render||link||2019-03-25||1151|
|443||ApolloScape Semantic Segmentation||The ApolloScape Parsing dataset is provided by Baidu for the CVPR 2018 Workshop on Autonomous Driving Challenge. It is expected that the Scene Parsing dataset ...||segmentation semantic scene benchmark size urban autonomous driving camera calibration video||link||2019-03-08||1153|
|415||Total Text Dataset||To facilitate new text detection research, we introduce the Total-Text dataset, which is more comprehensive than the existing text datasets. The Tota...||text detection, text recognition, scene text detection||link||2020-04-16||2282|
|397||MPI-I VISPR (Visual Privacy)||We present a dataset to address the problem of visual privacy - where users unintentionally leak private information when sharing personal images online, such a...||privacy multilabel classification flickr scene regression||link||2018-04-13||644|
|396||ADE20k||Scene Parsing Benchmark: scene parsing data and part segmentation data derived from the ADE20K dataset can be downloaded from the MIT Scene Parsing Benchmark. Images ...||segmentation semantic annotation benchmark scene recognition||link||2017-08-03||829|
|376||ScanNet||ScanNet is an RGB-D video dataset containing 2.5 million views in more than 1500 scans, annotated with 3D camera poses, surface reconstructions, and instance-le...||scene indoor synthetic cad room layout rendering realism 3d segmentation object recognition||link||2017-05-12||938|
|375||SUNCG: Indoor Scenes||The SUNCG dataset is a Large 3D Model Repository for Indoor Scenes. SUNCG is an ongoing effort to establish a richly-annotated, large-scale dataset of 3D s...||scene indoor synthetic room layout rendering realism 3d segmentation object recognition||link||2020-05-15||2061|
|374||SceneNet RGB-D Synthetic Indoor||SceneNet RGB-D is a dataset comprising 5 million photorealistic images of synthetic indoor trajectories with ground truth. It expands the previous work of Scene...||scene indoor synthetic robot navigation rendering 3d reconstruction trajectory lighting segmentation slam||link||2017-05-02||943|
|363||ETH/Yahoo Video2Gif dataset||The Video2GIF dataset contains over 100,000 pairs of GIFs and their source videos. The GIFs were collected from two popular GIF websites (makeagif.com, gifsoup....||highlight video summarization gif summary scene understanding||link||2017-09-12||696|
|338||MIT LaMem: Large-Scale Image Memorability Dataset||This database contains 60,000 images with memorability scores. The images come from a variety of datasets including SUN, COCO, image popularity, AVA, and severa...||memorability aesthetics object scene popularity||link||2020-03-09||1241|
|291||MIT Places205||The Places205 database contains 2.5 million images from 205 scene categories for the academic public. The image dataset contains 2,448,873 images from 205 scene c...||place recognition urban scene feature learning||link||2016-02-24||1484|
|263||Crowd Dataset||The crowd datasets are collected from a variety of sources, such as UCF and data-driven crowd datasets. The sequences are diverse, representing dense crowd in t...||crowd video detection anomaly scene understanding human pedestrian||link||2017-09-19||3059|
|239||CUHK crowd dataset||CUHK crowd dataset introduces the largest publicly available crowd dataset of 474 videos from 215 crowded scenes. It has been used in the paper: Scene-Ind...||crowd analysis group detection analysis scene understanding dataset||link||2020-05-03||2568|
|234||UMD Dynamic Scene Recognition||The UMD Dynamic Scene Recognition dataset consists of 13 classes and 10 videos per class and is used to classify dynamic scenes. The dataset has been describ...||scene recognition classification dynamic video motion||link||2017-01-05||1471|
|228||MPI VehicleScenes||Abstract: Scene understanding has (again) become a focus of computer vision research, leveraging advances in detection, context modeling, and tracking. In thi...||semantic segmentation scene understanding classification 3d car pedestrian||link||2014-06-10||2147|
|212||Polo Instance Segmentation||The Polo instance segmentation dataset is a semantic segmentation task for Hough transform based segmentation masks. It consists of supervised segmentation for ...||semantic segmentation horse human outdoor mask scene understanding||n/a||2016-01-21||1621|
|181||All I Have Seen (AIHS)||The All I Have Seen (AIHS) dataset was created to study the properties of total visual input in humans. For around two weeks, Nebojsa Jojic wore a camera capturin...||video summary user study clustering similarity outdoor indoor scene 3d||link||2018-09-19||1361|
|101||CIFAR-10 / 100||The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images. ...||classification, tiny, color, patch, scene, object||link||2013-08-08||1261|
|20||CALTECH 256||The CALTECH 256 dataset by Griffin, Holub, and Perona contains 30607 images for 256 categories....||classification centered object scene image||link||2013-08-08||1429|
|19||CALTECH 101||The CALTECH 101 dataset by Li Fei-Fei contains images for 101 categories with about 40 to 800 images per category. Most categories have about 50 images at rough...||classification centered object scene image||link||2013-08-08||1441|