This website provides a list of frequently used computer vision datasets. Wait, there is more!
There is also a description containing common problems, pitfalls and characteristics and now a searchable TAG cloud.
Plus, this is open for crowd editing (if you pass the ultimate Turing test)! - Questions? yacvid [at] hayko [dot] at
Content, Design and Idea © by Hayko Riemenschneider, 2011-2018. Texts and Images are subject to copyright by the respective authors.
Hey! If you're reading this, why not help and update the description of the dataset you're working on? Add a new dataset! Yay!
|355||IMPART multi-modal/multi-view||The multi-modal/multi-view datasets are created in a cooperation between University of Surrey and Double Negative within the EU FP7 IMPART project. The sourc...||multi-view multi-mode video rgbd lidar 3d model color indoor outdoor dynamic action face human emotion||link||2017-01-01||653|
|269||Daimler Urban Segmentation Dataset||The Daimler Urban Segmentation Dataset consists of video sequences recorded in urban traffic. The dataset consists of 5000 rectified stereo image pairs with a r...||semantic segmentation outdoor urban stereo motion||link||2015-06-26||1659|
|260||Eurasian Cities dataset||The Eurasian Cities dataset contains 103 images of outdoor urban scenes taken in Eurasian cities. It is annotated with horizontal and vertical vanishing points ...||vanishing line point geometry pose urban reconstruction outdoor manhattan||link||2018-01-11||1352|
|251||ETHZ CVL RueMonge 2014||This ETHZ CVL RueMonge 2014 dataset is used for 3D reconstruction and semantic mesh labelling for urban scene understanding. It was first published in  and p...||semantic segmentation 3d reconstruction architecture paris benchmark source code urban recognition classification outdoor pointcloud mesh||link||2014-11-24||1834|
|212||Polo Instance Segmentation||The Polo instance segmentation dataset is a semantic segmentation task for Hough transform based segmentation masks. It consists of supervised segmentation for ...||semantic segmentation horse human outdoor mask scene understanding||n/a||2016-01-21||1258|
|206||GaTech VideoContext||The GaTech VideoContext dataset consists of over 100 groundtruth annotated outdoor videos with over 20000 frames for the task of geometric context evaluation i...||video geometry context classification semantic segmentation unsupervised supervised outdoor urban nature||link||2018-12-07||1247|
|191||Daimler Mono Pedestrian Classification Benchmark||The Daimler Mono Pedestrian Classification Benchmark dataset consists of two parts: a base data set. The base data set contains a total of 4000 pedestrian- a...||pedestrian classification outdoor urban object scale illumination||link||2013-09-18||1169|
|190||Daimler Mono Pedestrian Detection Benchmark||The Daimler Mono Pedestrian Detection Benchmark dataset contains a large training and test set. The training set contains 15,560 pedestrian samples (image cut-o...||pedestrian detection outdoor urban mono scale object||link||2013-09-18||1291|
|188||KTH Multiview Football||The KTH Multiview Football dataset contains 771 images of football players, taken from 3 views at 257 time instances, with 14 annotated body joints. ...||multiview pedestrian tracking detection object camera outdoor game soccer pose recognition multitarget||link||2018-06-28||2074|
|181||All I Have Seen (AIHS)||The All I Have Seen (AIHS) dataset was created to study the properties of total visual input in humans. For around two weeks, Nebojsa Jojic wore a camera capturin...||video summary user study clustering similarity outdoor indoor scene 3d||link||2018-09-19||1074|
|165||ICG Multi-Camera and Virtual PTZ||The ICG Multi-Camera and Virtual PTZ dataset contains the video streams and calibrations of several static Axis P1347 cameras and one panoramic video from a sph...||multiview pedestrian tracking detection object camera calibration graz network video panorama crowd outdoor multitarget||link||2017-08-19||1732|
|117||YorkUrbanDB||The York Urban Line Segment Database is a compilation of 102 images (45 indoor, 57 outdoor) of urban environments consisting mostly of scenes from the campus of...||vanishing, point, pose, urban, reconstruction, outdoor, geometry, manhattan||link||2013-09-18||964|
|104||Make3D Depth||The Make3D Depth dataset is designed to learn features to estimate scene depth from a single image. This dataset contains aligned image and range data: Make3...||depth, learning, single view, outdoor, indoor||link||2018-03-16||1757|
|100||Sowerby||The Sowerby dataset contains 105 images for semantic segmentation....||semantic, segmentation, outdoor||n/a||2014-09-26||1168|
|93||Street View Text||The Street View Text (SVT) dataset contains 647 words and 3796 letters in 249 images harvested from Google Street View. The dataset is more challenging becaus...||text, detection, recognition, classification, outdoor, urban||link||2014-01-13||1430|
|89||Corel Photo Gallery||This image database is a part of the "Corel Gallery Magic" (commercial product). It contains 80000 images divided into 800 categories of 100 images. These image...||semantic, segmentation, outdoor||n/a||2017-01-19||1065|
|81||Zurich Hoengg||Zurich Hoengg (Switzerland) is an aerial dataset. The dataset consists of 4 aerial images in colour (Figures 2-5), scanned with 14 microns, the format is Ti...||aerial, semantic, segmentation, outdoor||link||2013-03-11||1122|
|79||LabelMe||The goal of LabelMe is to provide an online annotation tool to build image databases for computer vision research. You can contribute to the database by visitin...||segmentation, semantic, outdoor, detection, urban, software||link||2013-03-14||1157|
|72||Acute3D Aiguille du Midi MVS||Aiguille du Midi. France showing photographs with Camera: Mamiya ZD. 55mm. - Resolution: 5Mpixels, 53 images - Photographer: B. Vallet (Imagine/EVD - 2006) ...||sfm, reconstruction, mesh, large scale, outdoor||link||2013-03-21||1247|
|69||HCI Robust Vision||Estimate robust and reliable depth or motion fields on our challenging real world videos! ...||flow, depth, stereo, outdoor||link||2016-09-08||1242|
|37||MSRC vNIPS||The MSRC vNIPS dataset is the MSRC v2 dataset with new annotations for much more accurate segmentations for 93 images. Efficient Inference in Fully Connected...||segmentation, semantic, outdoor||link||2013-03-11||1008|
|36||MSRC v2||The MSRC v2 dataset is an extension of the MSRC v1 dataset from Microsoft Research in Cambridge. It contains 591 images and 23 object classes with accurate pixe...||segmentation, semantic, outdoor||link||2016-08-28||2681|
|35||MSRC v1||The MSRC v1 dataset from Microsoft Research in Cambridge contains 240 images and 9 object classes with coarse pixel-wise labeled images. The dataset is commonl...||segmentation, semantic, outdoor||link||2016-09-07||2113|
|16||PETS 2009||The PETS 2009 dataset contains 3 parts showing multi-view sequences containing pedestrians walking in an outdoor environment. The parts are used for person coun...||frontview, outdoor, pedestrian, detection, tracking, overlap, occlusion, multitarget, human||link||2018-11-30||1958|