Yet Another Computer Vision Index To Datasets (YACVID)

This website lists frequently used computer vision datasets. Wait, there is more!
Each dataset comes with a description covering common problems, pitfalls and characteristics, and there is now a searchable tag cloud.
Plus, the index is open for crowd editing (if you pass the ultimate Turing test)! Questions? yacvid [at] hayko [dot] at

Content, design and idea © by Hayko Riemenschneider, 2011-2016. Texts and images are subject to copyright by their respective authors.

Hey! If you're reading this, why not help and update the description of the dataset you're working on?




2d   3d   4d   aachen   abdomen   abrupt   accelerometer   accuracy   action   activity   address   adhead   adjustment   adult   aerial   aesthetics   affordance   age   aircraft   airplane   airport   alignment   amazon   ambiguous   analysis   anger   animal   animation   annotation   anomaly   apartment   api   appearance   applelogo   architecture   articulation   artificial   aspect   atmospheric   attention   attribute   attributes   authentication   automatic   autonomous   avoid   axis   babyface   background   balance   baseline   behavior   belgium   benchmark   benchmarking   berlin   bike   bilateral   bim   binary   biology   biometric   biometry   blender   blur   boat   body   bone   bottle   boundingbox   brain   brand   bremen   buffy   building   bullseye   bundle   bunny   byu   cad   calibration   california   caltech   camera   canada   caption   captioning   capture   car   cardinal   categorization   category   cats   cbir   celebrity   cell   centered   chair   challenge   change   chemistry   chest   chicaco   chromaticity   church   circle   city   cityscapes   classification   clothing   cloud   clustering   clutter   cnn   co-localization   co-saliency   co-segmentation   co-skeletonization   coco   code   codebook   coffee   collaborative   color   community   comparison   computer   condition   constancy   context   contour   cooking   copyright   counting   cover   cow   crepe   crf   crop   cross-view   crowd   ct   cutting   daily   dance   data   dataset   day   daylight   decomposition   deep   defocus   deformation   denoising   dense   depth   description   descriptor   detail   detection   dichromatic   disease   disgust   disparity   dogs   domain   dped   driving   drone   dubrovnik   duplicate   dynamic   ear   edge   egocentric   ellipse   emotion   endtoend   enhancement   estimation   evaluation   event   expertise   expression   eye   facade   face   facial   fake   fashion   fear   feature   field   fine-grained   
fingerprint   fingertip   first-person   fish   fisheye   fitting   flickr   flight   floorplan   flow   fly   flying   food   foot   footprint   foreground   fov   frames   frontview   fundus   gait   game   gan   gaze   gender   genetic   genome   geography   geometry   geoscience   geotag   geotagged   germany   gesture   getry   gif   giraffe   gis   global   google   gps   grammar   graphics   grayscale   graz   ground   groundtruth   group   growth   gsd   hand   handwritten   hd   head   heart   heat   hierarchy   high-definition   high-resolution   highlight   highway   holes   horse   house   howto   human   identification   illumination   illuminiation   illusion   image   imagenet   images   imdb   indoor   inertial   initialization   inserts   instance   intake   interaction   interactive   interest   internet   invariance   ir   isar   iso   joy   kernels   keyframe   kimia   kinect   kitchen   kitti   label   labeling   laboratory   land   landmark   lane   language   large   large-scale   laser   lattice   layout   leaf   learning   letter   leuven   lidar   lifespan   light   lightfield   lighting   limited   line   lip   lisbon   liver   local   localization   location   logo   lowlevel   machine   makeup   manhattan   map   maritime   mask   match   matching   material   medial   medical   medicine   memorability   mesh   metadata   milling   mirror   mobile   model   modeling   monitoring   mono   montage   motion   motorbike   mouse   mouth   movement   movie   mpeg   mser   mug   multi-camera   multi-class   multi-human   multi-mode   multi-sensor   multi-spectral   multi-view   multilabel   multimedia   multimodal   multiple   multitarget   multiview   naming   natural   nature   navigation   netherlands   network   neutral   newyork   night   nir   noise   normal   nude   number   object   occlusion   ocr   odometry   omnidirection   omnidirectional   online   open-view   operation   optical   optimization   organ   original   osnabrueck   
outdoor   overhead   overlap   oxford   pair   pairwise   pan   panorama   panoramio   parallel   paris   parsing   part   partial   pasadena   pascal   patch   path   pattern   pedestrian   pedestrians   people   person   perspective   phase   photo   photogrammetry   physics   pittsburgh   place   plane   planning   point   pointcloud   polygon   popularity   pornography   pose   potsdam   presentation   pressure   primitive   privacy   procedural   profile   project   proposal   pruning   ptz   quality   question   radar   random   rank   ranking   ransac   rate   ratio   re-identification   reading   real   real-world   realism   recipe   recognition   reconstruction   rectification   rectified   reflection   registration   regression   regular   remote   removal   rendering   repetition   resolution   restoration   retina   retinal   retrieval   rgb   rgbd   road   robot   robotic   robust   rome   room   ros   rotation   sad   saliency   sampling   sanfrancisco   satellite   scale   scan   scanner   scene   search   segmentation   selfdriving   semantic   sense   sensing   sequence   series   sfm   shadow   shape   sheffield   shoes   shots   shutter   sideview   sign   signs   similarity   simultaneous   single   singleview   size   skeleton   skeletonization   sketch   skin   sky   slam   smartphone   soccer   social   software   source   space   spain   spanish   speaker   speech   sphere   sport   stability   stabilization   static   stationary   stereo   stereovision   stochastic   street   structure   structured   study   stuff   style   stylization   subpixel   subtraction   summarization   summary   superpixel   superresolution   supervised   supervisely   surface   surgery   surprise   surveillance   swan   switzerland   sydney   symmetry   synthetic   table   target   taxonomy   temporal   text   textile   texture   texture-less   therapy   thermal   things   time   timelapse   tiny   tokyo   tool   tools   top-view   tracking   tracklet   traffic   
trajectory   transfer   transportation   trees   triangulation   truth   tuberculosis   turbulence   type   uas   uav   udacity   ultrasound   understanding   uneven   unmanned   unsupervised   urban   user   vanishing   variation   vehicle   vehicles   vessel   video   view   viewpoint   virtual   visible   vision   visual   voc   volleyball   vqa   vt   water   wavelength   weakly   wear   wearable   weather   webcam   white   wide   wiki   wikipedia   wild   workflow   world   worldwide   xray   year   youtube   zoom   zurich  
«showing 641 tags of 641 total tags for 459 datasets (1.4) »


semantic
DID | Name | Description | Tags | URL | Date | Views
444 | Supervisely Person Dataset | The Supervisely Person Dataset consists of 5711 images with 6884 high-quality annotated person instances. All steps below are done inside Supervisely without a... | person pedestrian segmentation semantic mask supervisely annotation automatic dataset instance | link | 2018-04-15 | 237
443 | ApolloScape Semantic Segmentation | The ApolloScape Parsing dataset is provided by Baidu for the CVPR 2018 Workshop on Autonomous Driving Challenge. It is expected that the Scene Parsing dataset ... | segmentation semantic scene benchmark size urban autonomous driving camera calibration | link | 2018-04-25 | 116
442 | YouTube Co-localization Dataset (ECCV + IEEE Trans. CSVT papers) [GEU and NTU] | The dataset consists of bounding box annotations for 15k frames of videos collected from YouTube Objects Dataset. If you find this dataset useful, kindly ci... | Co-localization Co-segmentation Co-saliency Video CATS Tracklet Benchmark Binary Object Retrieval Segmentation Semantic Similarity Tracking Matching Localization | link | 2018-03-21 | 107
430 | WildDash Benchmark | This website provides a dataset and benchmark for semantic and instance segmentation. We aim to improve the expressiveness of performance evaluation for compute... | semantic instance segmentation Cityscapes weather | link | 2018-03-09 | 82
427 | CITY-OSM - ETH Zurich | Learning Aerial Image Segmentation From Online Maps: This is the ground truth data generated for the publication Learning Aerial Image Segmentation F... | semantic computer vision aerial image segmentation map geoscience remote sensing deep learning berlin chicaco paris potsdam tokyo zurich | link | 2018-01-25 | 182
421 | TUD Dynamic scenes dataset | The dynamic scenes dataset contains image sequences consisting of 1936 images overall. The images are taken from a camera inside a driving car and mainly show r... | semantic segmentation dynamic urban road street object driving crf | link | 2017-12-12 | 220
419 | UC Merced Land Use Dataset | The UC Merced dataset is a 21-class land use image dataset meant for research purposes. There are 100 images for each of the following classes: agricultural ... | semantic segmentation classification aerial land building urban | link | 2017-11-28 | 232
407 | Inria Aerial Image Labeling | The Inria Aerial Image Labeling addresses a core topic in remote sensing: the automatic pixelwise labeling of aerial imagery (link to paper). Dataset feature... | semantic segmentation aerial urban city groundtruth building footprint house | link | 2018-03-22 | 371
405 | SydneyHouse HouseCraft | In HouseCraft, we utilize rental ads to create realistic textured 3D models of building exteriors. In particular, we exploit the address of the property and its... | house urban building city floorplan street semantic segmentation localization registration google sydney | link | 2018-01-28 | 286
404 | Zurich Summer Dataset | The Zurich Summer v1.0 dataset is a collection of 20 chips (crops), taken from a QuickBird acquisition of the city of Zurich (Switzerland) in August 2002. Quick... | satellite segmentation semantic aerial urban city zurich pan nir rgb gsd superpixel annotation | link | 2017-09-12 | 274
396 | ADE20k Scene Parsing Benchmark | Scene parsing data and part segmentation data derived from the ADE20K dataset can be downloaded from the MIT Scene Parsing Benchmark. mages ... | segmentation semantic annotation benchmark scene recognition | link | 2017-08-03 | 285
394 | Matterport 2D-3D-Semantics Data | The 2D-3D-S dataset provides a variety of mutually registered modalities from 2D, 2.5D and 3D domains, with instance-level semantic and geometric annotations. I... | 3d panorama semantic segmentation depth normal indoor building reconstruction large-scale | link | 2017-07-27 | 376
353 | COCO-Stuff | COCO-Stuff augments the COCO dataset with pixel-level stuff annotations for 10,000 images. These annotations can be used for scene understanding tasks like sema... | semantic segmentation stuff things COCO captioning annotation groundtruth benchmark | link | 2017-02-16 | 765
334 | LabelMeFacade | The LabelMeFacade dataset contains buildings, windows, sky and a limited number of unlabeled regions (covering at most 20% of the image). This procedure res... | segmentation semantic facade urban rectified recognition | link | 2016-08-23 | 806
330 | Cityscapes | We present a new large-scale dataset that contains a diverse set of stereo video sequences recorded in street scenes from 50 different cities, with high quality... | stereo video urban city semantic segmentation detection car person pedestrian weakly | link | 2018-03-22 | 1328
315 | Geosemantic | Geosemantic is a dataset of object locations from GIS and a query image with metadata. It is used to project the buildings and streets that are in the field... | semantic segmentation gps geography supervised gis | link | 2016-01-07 | 657
283 | ISPRS WG III/4 | ISPRS Test Project on Urban Classification, 3D Building Reconstruction and Semantic Labeling. In this part of our working group site you will get further inform... | aerial multiview 3d photogrammetry germany canada semantic segmentation urban city recognition benchmark | link | 2015-06-16 | 841
269 | Daimler Urban Segmentation Dataset | The Daimler Urban Segmentation Dataset consists of video sequences recorded in urban traffic. The dataset consists of 5000 rectified stereo image pairs with a r... | semantic segmentation outdoor urban stereo motion | link | 2015-06-26 | 1464
266 | Paris Art Deco Facades | The Paris Art Deco Facades dataset consists of 79 / 80 images of rectified facades of the architectural style Art Deco, which has different sizes of windows, de... | paris semantic segmentation recognition architecture facade urban city procedural grammar | link | 2015-01-20 | 857
251 | ETHZ CVL RueMonge 2014 | The ETHZ CVL RueMonge 2014 dataset is used for 3D reconstruction and semantic mesh labelling for urban scene understanding. It was first published in [1] and p... | semantic segmentation 3d reconstruction architecture paris benchmark source code urban recognition classification outdoor pointcloud mesh | link | 2014-11-24 | 1595
249 | Image Sequence Analysis Test Site (EISATS) | The .enpeda.. Image Sequence Analysis Test Site (EISATS) offers sets of long bi- or trinocular image sequences recorded in the context of vision-based driver as... | stereo vision optical flow motion analysis semantic segmentation | link | 2014-09-30 | 1211
247 | PASCAL VOC Parts | The PASCAL VOC is augmented with segmentation annotation for semantic parts of objects. For example, for the person category, we provide segmentation mask for 2... | detection recognition pascal object part pedestrian human segmentation semantic | link | 2014-09-30 | 1451
240 | Microsoft COCO | The Microsoft COCO (mscoco) is an image recognition and segmentation dataset which contains more than 300k images for more than 70 categories. Other features: Mo... | object context segmentation detection recognition benchmark semantic | link | 2015-05-02 | 1650
233 | PASCAL Context | We would like to announce the release of the PASCAL-Context dataset. We augmented the PASCAL VOC 2010 dataset with annotations for 400+ additional categories. In the cu... | semantic segmentation pascal benchmark category recognition dense shape | link | 2014-07-17 | 1104
229 | Paris Rue Madame | The Paris-rue-Madame dataset contains 3D Mobile Laser Scanning (MLS) data from rue Madame, a street in the 6th Parisian district (France). The test zone contains ap... | semantic segmentation pointcloud 3d laser classification | link | 2014-06-10 | 857
228 | MPI VehicleScenes | Abstract: Scene understanding has (again) become a focus of computer vision research, leveraging advances in detection, context modeling, and tracking. In thi... | semantic segmentation scene understanding classification 3d car pedestrian | link | 2014-06-10 | 1308
212 | Polo Instance Segmentation | The Polo instance segmentation dataset is a semantic segmentation task for Hough transform based segmentation masks. It consists of supervised segmentation for ... | semantic segmentation horse human outdoor mask scene understanding | n/a | 2016-01-21 | 1128
206 | GaTech VideoContext | The GaTech VideoContext dataset consists of over 100 groundtruth annotated outdoor videos with over 20000 frames for the task of geometric context evaluation i... | video geometry context classification semantic segmentation unsupervised supervised outdoor urban nature | link | 2014-04-06 | 1094
197 | Stanford Background Dataset | The Stanford Background Dataset is a new dataset introduced in Gould et al. (ICCV 2009) for evaluating methods for geometric and semantic scene understanding. T... | semantic segmentation urban classification nature geometry | link | 2016-01-21 | 2059
195 | Yotta | The Yotta dataset consists of 70 images for semantic labeling given in 11 classes. It also contains multiple videos and camera matrices for 14 km of driving. ... | semantic segmentation urban video camera 3d reconstruction classification | link | 2013-09-30 | 1069
179 | CMP Facades | The CMP Facade dataset consists of facade images assembled at the Center for Machine Perception, which includes 600 rectified images of facades from various sou... | facade rectification urban semantic classification recognition structure similarity segmentation | link | 2015-06-19 | 906
149 | NYU Depth v2 | The NYU-Depth V2 data set is comprised of video sequences from a variety of indoor scenes as recorded by both the RGB and Depth cameras from the Microsoft Kinec... | semantic segmentation depth kinect label reconstruction | link | 2017-06-01 | 2392
148 | NYU Depth v1 | The NYU-Depth data set is comprised of video sequences from a variety of indoor scenes as recorded by both the RGB and Depth cameras from the Microsoft Kinect. ... | semantic segmentation depth kinect label reconstruction | link | 2014-10-05 | 1316
134 | Image Memorability | The Image Memorability dataset contains target and filler images, precomputed features and annotations, and memorability. It gives features and annotations for t... | aesthetics, semantic, quality, memorability | link | 2013-04-17 | 990
132 | Aesthetic Visual Analysis | The Aesthetic Visual Analysis (AVA) dataset studies the organization of content by aesthetic preference. It contains over 250,000 images along with a rich variety o... | aesthetics, semantic, quality, memorability | link | 2017-01-10 | 1504
126 | ISPRS Urban Classification | ISPRS Test Project on Urban Classification and 3D Building Reconstruction. The ISPRS working group III/4 announces the release of the 2D semantic labeling ben... | 3d, reconstruction, building, urban, city, semantic, classification, recognition | link | 2014-11-24 | 952
123 | CMU/VMR Urban Image+Laser | The CMU/VMR Urban Image+Laser dataset contains 372 images linked with 3D laser points projections. There are additional images (due to the laser scanner being turne... | reconstruction, sfm, urban, semantic, segmentation, laser | link | 2013-04-02 | 1193
121 | Oakland 3D | This repository contains labeled 3-D point cloud laser data collected from a moving platform in an urban environment. Data are provided for research purposes. ... | reconstruction, sfm, urban, semantic, segmentation, laser | link | 2014-06-10 | 1175
100 | Sowerby | The Sowerby dataset contains 105 images for semantic segmentation.... | semantic, segmentation, outdoor | n/a | 2014-09-26 | 1064
90 | eTrims | The eTrims dataset is comprised of two datasets, the 4-Class eTRIMS Dataset with 4 annotated object classes and the 8-Class eTRIMS Dataset with 8 annotated obje... | semantic, segmentation, urban, reconstruction | link | 2013-03-12 | 887
89 | Corel Photo Gallery | This image database is a part of the "Corel Gallery Magic" (commercial product). It contains 80000 images divided into 800 categories of 100 images. These image... | semantic, segmentation, outdoor | n/a | 2017-01-19 | 931
86 | ICG Graz240 | The ICG Graz240 dataset consists of 240 buildings with 5400 redundant images with a total of 5542 window instances. Window detection itself is difficult due to ... | segmentation, detection, semantic, urban, graz | link | 2016-03-29 | 1189
81 | Zurich Hoengg | Zurich Hoengg (Switzerland) is an aerial dataset. The dataset consists of 4 aerial images in colour (Figures 2-5), scanned with 14 microns, the format is Ti... | aerial, semantic, segmentation, outdoor | link | 2013-03-11 | 983
79 | LabelMe | The goal of LabelMe is to provide an online annotation tool to build image databases for computer vision research. You can contribute to the database by visitin... | segmentation, semantic, outdoor, detection, urban, software | link | 2013-03-14 | 994
68 | The KITTI Vision Benchmark Suite | We take advantage of our autonomous driving platform Annieway to develop novel challenging real-world computer vision benchmarks. Our tasks of interest are: ste... | stereo, depth, flow, detection tracking, reconstruction, sfm, odometry, segmentation, semantic car depth | link | 2017-11-26 | 1581
39 | Leuven Stereo Scene | The Leuven Stereo Scene dataset is a scene and depth dataset. There exist two variants of this dataset - a CVPR 2007 paper [1] by Leibe et al. for detection and... | segmentation, semantic, reconstruction, urban, sfm, 3d, leuven, depth, stereo | link | 2018-04-20 | 2107
37 | MSRC vNIPS | The MSRC vNIPS dataset is the MSRC v2 dataset with new annotations for much more accurate segmentations for 93 images. Efficient Inference in Fully Connected... | segmentation, semantic, outdoor | link | 2013-03-11 | 857
36 | MSRC v2 | The MSRC v2 dataset is an extension of the MSRC v1 dataset from Microsoft Research in Cambridge. It contains 591 images and 23 object classes with accurate pixe... | segmentation, semantic, outdoor | link | 2016-08-28 | 2314
35 | MSRC v1 | The MSRC v1 dataset from Microsoft Research in Cambridge contains 240 images and 9 object classes with coarse pixel-wise labeled images. The dataset is commonl... | segmentation, semantic, outdoor | link | 2016-09-07 | 1824
34 | CamVid | The Cambridge-driving Labeled Video Database (CamVid) dataset from Gabriel Brostow [?] contains ten minutes of video footage and corresponding semantically labe... | sfm, depth, semantic, segmentation, urban | link | 2016-04-18 | 3911
33 | ECP New York 2011 | The ECP New York dataset contains 10 manually segmented buildings from New York City, USA. Segmentation evaluation using the Dice coefficient is calculated for the ... | segmentation, semantic, procedural, reconstruction, urban, newyork | link | 2013-08-08 | 794
32 | ECP Paris 2011 | The ECP Paris 2011 dataset consists of 104 images taken from rue Monge in the fifth district of Paris; we kept only 20 for training and 10 for testing. Howev... | segmentation, semantic, procedural, reconstruction, urban, paris | link | 2013-08-08 | 827
31 | ECP Paris 2010 | The Ecole Centrale Paris 2010 (Paris 2010) dataset consists of 30 images of densely annotated building facades in seven classes - wall, window, sky, shop, balco... | segmentation, semantic, procedural, reconstruction, urban, paris | link | 2013-03-11 | 993
30 | ICG Graz50 | This is a dataset of rectified facade images and semantic labels. The goal of the annotation is to study the layout of the facades. It contains 50 images of... | segmentation, semantic, procedural, reconstruction, urban, graz | link | 2014-01-28 | 1047

