This website provides a list of frequently used computer vision datasets. Wait, there is more!
Each dataset also comes with a description covering common problems, pitfalls and characteristics, and there is now a searchable tag cloud.
Plus, the list is open for crowd editing (if you pass the ultimate Turing test)! Questions? yacvid [at] hayko [dot] at
Content, Design and Idea © by Hayko Riemenschneider, 2011-2018. Texts and images are subject to copyright by their respective authors.
|484||Flickr30k Entities||We propose to use the visual denotations of linguistic expressions (i.e. the set of images they describe) to define novel denotational similarity metrics, which...||phrase grounding caption text analysis image description flickr association video link||link||2019-01-23||273|
|302||CMP map2photo||The CMP map2photo dataset consists of 6 pairs, where one image is a satellite photo and the other is a map of the same area. The task is to match these images...||feature detection description matching map remote sensing wide baseline||link||2015-08-13||1155|
|301||CMP Extreme Zoom Dataset||The Extreme Zoom Dataset (EZD) consists of 6 image sets with increasing zoom factor, from a general scene view to a focus on a single detail. MODS: Fast and Robust Metho...||feature detection description matching viewpoint zoom||link||2015-07-15||971|
|300||CMP WxBS dataset||The Wide (multiple) Baseline Dataset. 31 image pairs, simultaneously combining several nuisance factors: geometry, illumination, IR-visible, etc. WxBS: Wide ...||feature detection description matching viewpoint IR day night||link||2015-07-15||1755|
|224||CMP Extreme View Dataset||15 wide baseline stereo image pairs with large viewpoint change and ground truth homographies provided. Image size: ~1000x700 pixels, RGB. D. Mishkin and M. ...||feature detection description matching viewpoint||link||2015-07-15||1386|
|223||SHOT 3D shape description||The 3D shape description dataset consists of multiple sub-datasets. Descriptor Matching - Dataset 1 & 2 (Stanford): These datasets, created from some of the m...||3d shape description benchmark reconstruction registration matching||link||2015-06-21||1398|
|218||VidPairs||The VidPairs dataset contains 133 pairs of images, taken from 1080p HD (~2 megapixel) official movie trailers. Each pair consists of images of the same scene wi...||video pair matching patch description flow dense optical||link||2015-06-19||1155|
|53||DTU Robot||The DTU Robot dataset consists of color images of 60 scenes acquired in a controlled setup from 119 different positions and under different lighting. For each s...||feature detection description matching sfm reconstruction illumination||link||2016-05-15||1491|
|52||Graffiti||The Graffiti dataset by Krystian Mikolajczyk and Cordelia Schmid contains 48 images split into 8 sequences with 6 images each showing different structured and t...||feature detection description rectification benchmark||link||2019-07-10||1349|
|49||PhotoTourism Pair Patch||The data is taken from Photo Tourism reconstructions from Trevi Fountain (Rome), Notre Dame (Paris) and Half Dome (Yosemite). Each dataset consists of a series ...||feature matching description pair sfm patch learning||link||2018-01-10||1428|
|48||CALTECH 101 Category Patch Pairs||The CALTECH 101 Category Patch Pairs dataset measures invariance to intra-category variation. The dataset contains a training set and testing set of image patc...||feature matching description pair binary||link||2017-02-14||3553|