This website provides a list of frequently used computer vision datasets. Wait, there is more!
Each dataset also has a description covering common problems, pitfalls and characteristics, and there is now a searchable tag cloud.
Plus, this is open for crowd editing (if you pass the ultimate Turing test)! Questions? yacvid [at] hayko [dot] at
Content, Design and Idea © by Hayko Riemenschneider, 2011-2016. Texts and images are subject to copyright by the respective authors.
|415||Total Text Dataset||In order to facilitate new text detection research, we introduce the Total-Text dataset, which is more comprehensive than the existing text datasets. The Tota...||text detection, text recognition, scene text detection||link||2017-11-02||100|
|397||MPI-I VISPR (Visual Privacy)||We present a dataset to address the problem of visual privacy - where users unintentionally leak private information when sharing personal images online, such a...||privacy multilabel classification flickr scene regression||link||2017-08-08||95|
|396||ADE20k||Scene Parsing Benchmark: scene parsing data and part segmentation data derived from the ADE20K dataset can be downloaded from the MIT Scene Parsing Benchmark. Images ...||segmentation semantic annotation benchmark scene recognition||link||2017-08-03||166|
|376||ScanNet||ScanNet is an RGB-D video dataset containing 2.5 million views in more than 1500 scans, annotated with 3D camera poses, surface reconstructions, and instance-le...||scene indoor synthetic cad room layout rendering realism 3d segmentation object recognition||link||2017-05-12||252|
|375||SUNCG: Indoor Scenes||The SUNCG dataset is a Large 3D Model Repository for Indoor Scenes. SUNCG is an ongoing effort to establish a richly-annotated, large-scale dataset of 3D s...||scene indoor synthetic room layout rendering realism 3d segmentation object recognition||link||2017-05-02||255|
|374||SceneNet RGB-D Synthetic Indoor||SceneNet RGB-D is a dataset comprising 5 million Photorealistic Images of Synthetic Indoor Trajectories with Ground Truth. It expands the previous work of Scene...||scene indoor synthetic robot navigation rendering 3d reconstruction trajectory lighting segmentation slam||link||2017-05-02||227|
|363||ETH/Yahoo Video2Gif dataset||The Video2GIF dataset contains over 100,000 pairs of GIFs and their source videos. The GIFs were collected from two popular GIF websites (makeagif.com, gifsoup....||highlight video summarization gif summary scene understanding||link||2017-09-12||276|
|291||MIT Places205||The Places205 database contains 2.5 million images from 205 scene categories for the academic public. The image dataset contains 2,448,873 images from 205 scene c...||place recognition urban scene feature learning||link||2016-02-24||870|
|263||Crowd Dataset||The crowd datasets are collected from a variety of sources, such as UCF and data-driven crowd datasets. The sequences are diverse, representing dense crowd in t...||crowd video detection anomaly scene understanding human pedestrian||link||2017-09-19||1627|
|239||CUHK crowd dataset||CUHK crowd dataset introduces the largest publicly available crowd dataset of 474 videos from 215 crowded scenes. It has been used in the paper: Scene-Ind...||crowd analysis, group detection and analysis, scene understanding||link||2016-09-14||1366|
|234||UMD Dynamic Scene Recognition||The UMD Dynamic Scene Recognition dataset consists of 13 classes and 10 videos per class and is used to classify dynamic scenes. The dataset has been describ...||scene recognition classification dynamic video motion||link||2017-01-05||1005|
|228||MPI VehicleScenes||Abstract: Scene understanding has (again) become a focus of computer vision research, leveraging advances in detection, context modeling, and tracking. In thi...||semantic segmentation scene understanding classification 3d car pedestrian||link||2014-06-10||1167|
|212||Polo Instance Segmentation||The Polo instance segmentation dataset is a semantic segmentation task for Hough transform based segmentation masks. It consists of supervised segmentation for ...||semantic segmentation horse human outdoor mask scene understanding||n/a||2016-01-21||1004|
|181||All I Have Seen (AIHS)||The All I Have Seen (AIHS) dataset was created to study the properties of total visual input in humans. For around two weeks, Nebojsa Jojic wore a camera capturin...||video summary user study clustering similarity outdoor indoor scene 3d||link||2013-09-05||827|
|101||CIFAR-10 / 100||The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images. ...||classification, tiny, color, patch, scene, object||link||2013-08-08||884|
|20||CALTECH 256||The CALTECH 256 dataset by Griffin, Holub and Perona contains 30607 images for 256 categories....||classification centered object scene image||link||2013-08-08||866|
|19||CALTECH 101||The CALTECH 101 dataset by Li Fei-Fei contains images for 101 categories with about 40 to 800 images per category. Most categories have about 50 images at rough...||classification centered object scene image||link||2013-08-08||878|