Yet Another Computer Vision Index To Datasets (YACVID)

This website provides a list of frequently used computer vision datasets. Wait, there is more!
There is also a description containing common problems, pitfalls and characteristics and now a searchable TAG cloud.
Plus, this is open for crowd editing (if you pass the ultimate turing test)! - Questions? yacvid [at] hayko [dot] at

Content, Design and Idea © by Hayko Riemenschneider, 2011-2016. Texts and Images are subject of copyright by the respective authors.

Hey! If you're reading this, why not help and update the description of the dataset you're working on?

Add a new dataset



2d   3d   4d   aachen   abdomen   abrupt   accelerometer   action   actions   activities   activity   address   adhead   adjustment   aerial   aesthetics   age   aircraft   airplane   airport   alignment   amazon   ambiguous   analysis   and   anger   animal   animation   annotation   anomaly   apartment   api   appearance   applelogo   architecture   articulation   aspect   attention   attribute   attributes   authentication   automatic   autonomous   avoid   axis   babyface   background   balance   baseline   behavior   belgium   benchmark   benchmarking   bike   bilateral   binary   biology   biometric   biometry   blender   blur   boat   body   bone   bottle   boundingbox   brand   bremen   buffy   building   bullseye   bundle   bunny   byu   cad   calibration   california   caltech   camera   canada   captioning   captions   capture   car   cardinal   categorization   category   celebrity   cell   centered   chair   challenge   change   chemistry   chest   chromaticity   church   circle   cities   city   classification   clothing   clustering   clutter   cnn   co-segmentation   co-skeletonization   coco   code   codebook   coffee   color   community   comparison   computer   conditions   constancy   context   contour   cooking   copyright   cosegmentation   counting   cover   cow   crepe   cross-view   crowd   ct   cutting   daily   dance   data   dataset   day   daylight   decomposition   deep   defocus   deformation   dense   depth   description   descriptor   detail   detection   dichromatic   disgust   disparity   dogs   domain   dped   driving   drone   dubrovnik   duplicate   dynamic   ear   edge   egocentric   ellipse   emotion   endtoend   enhancement   estimation   evaluation   event   expression   eye   facade   face   facial   fashion   fear   feature   field   fine-grained   fingerprint   fingertip   first-person   fish   fisheye   fitting   flickr   flight   floorplan   flow   fly   flying   food   foot   footprint   foreground   fov   frames   frontview   fundus   gait   game   gan   gaze   gender   genetic   genome   geography   geometry   geotag   geotagged   germany   gesture   getry   gif   giraffe   gis   global   google   gps   grammar   graphics   graz   ground   groundtruth   group   gsd   hand   handwritten   hd   head   heart   heat   hierarchy   high-definition   highlight   highway   holes   horse   house   human   humans   identification   illumination   image   imagenet   images   imdb   indoor   inertial   initialization   inserts   instance   intake   interaction   interactive   interest   internet   invariance   ir   isar   joy   kernels   keyframe   kimia   kinect   label   labeling   laboratory   land   landmark   lane   language   large   large-scale   laser   lattice   layout   learning   letter   leuven   lidar   light   lightfield   lighting   limited   line   lip   lisbon   liver   local   localization   location   logo   lowlevel   machine   manhattan   map   maritime   mask   match   matching   material   medial   medical   medicine   memorability   mesh   metadata   milling   mirror   mobile   model   modeling   modelling   monitoring   mono   montage   motion   motion-capture-data   motorbike   mouse   mouth   movement   movie   mpeg   mug   multi-camera   multi-class   multi-human   multi-mode   multi-sensor   multi-spectral   multi-view   multilabel   multimodal   multiple   multitarget   multiview   naming   natural   nature   navigation   network   neutral   newyork   night   nir   noise   normal   nude   number   object   objects   occlusion   ocr   odometry   omnidirection   omnidirectional   open-view   operation   optical   optimization   organ   original   osnabrueck   outdoor   overhead   overlap   oxford   pair   pairwise   pan   panorama   panoramio   parallel   paris   parsing   part   partial   pasadena   pascal   patch   path   pattern   pedestrian   people   person   perspective   phase   photo   photogrammetry   physics   pittsburgh   place   plane   planning   point   pointcloud   polygon   popularity   pornography   pose   presentation   pressure   primitive   privacy   procedural   profile   proposal   ptz   quality   question   radar   random   rank   ranking   ransac   rate   ratio   re-identification   reading   real   realism   recipe   recognition   reconstruction   rectification   rectified   reflection   registration   regression   regular   reidentification   remote   removal   rendering   repetition   resolution   retina   retinal   retrieval   rgb   rgb-d   rgbd   road   robot   robust   rome   room   ros   rotation   sad   saliency   sampling   sanfrancisco   satellite   scale   scan   scanner   scene   scenes   search   segmentation   selfdriving   semantic   sense   sensing   sequence   sfm   shadow   shadows   shape   sheffield   shoes   shots   shutter   sideview   sign   similarity   simultaneous   single   singleview   skeleton   skeletonization   sketch   skin   sky   slam   soccer   social   software   source   space   spain   spanish   speaker   speech   sphere   sport   stability   stabilization   static   stationary   stereo   stereovision   stochastic   street   structure   structured   study   stuff   stylization   subpixel   subtraction   summarization   summary   superpixel   superresolution   supervised   surface   surgery   surprise   surveillance   swan   switzerland   sydney   symmetry   synthetic   table   target   taxonomy   temporal   text   texture   texture-less   therapy   thermal   things   time   time-series   tiny   tool   tools   top-view   tracking   traffic   trajectory   transfer   transportation   triangulation   truth   tuberculosis   type   uas   uav   udacity   ultrasound   understanding   uneven   unmanned   unsupervised   urban   user   vanishing   variation   vehicle   vessel   video   view   viewpoint   visible   vision   visual   volleyball   vqa   vt   water   wavelength   weakly   wear   wearable   weather   webcam   white   wide   wikipedia   wild   workflow   world   xray   year   zoom   zurich  
«showing 591 tags of 591 total tags for 421 datasets (1.4) »


sfm
DID Name Description Tags URL Date Views
399 Osnabruck - Synthetic Scalable Cube Dataset Voxel Based Dataset for Systematic 3D reconstruction by artificial neural networks (ANNs). A synthetic scalable cube dataset for training, testing and valida... 3D, Deep Learning, Reconstruction, SfM, Synthetic city urban link 2017-09-12 94
349 HKUST Ambiguity Dataset This dataset contains two image collections, TempleOfHeaven and SportsArena, that are deemed hard for Structure-from-Motion (SfM). The method is described i... Structure Motion, Ambiguous structure sfm link 2017-11-28 318
222 Ford Car Dataset The Ford Car dataset is joint effort of Pandey et al. (for collecting images, Lidar points, calibration etc.) and us (for annotation of 2D and 3D objects). ... car detection lidar 3d groundtruth sfm link 2014-04-16 1881
186 Symmetry Facades The Symmetry Facades dataset contains 9 building facades with multiple images. It used for coupled symmetry and structure from motion detection. Coupled Str... symmetry facade building urban reconstruction sfm 3d repetition link 2013-09-05 1137
152 Colosseum and San Marco The Colosseum and San Marco are two image datasets for dense multiview stereo reconstructions used for evaluating the visual photo realism. The datasets are ... 3d, reconstruction, landmark, urban, sfm, aerial, street, flickr link 2017-11-28 1341
135 Quad 6K The Quad 6K dataset is a Structure-from-Motion dataset taken at Arts Quad at Cornell University campus and consists of 6514 images with ground truth positions o... reconstruction, sfm, urban, groundtruth, landmark, 3d gps link 2013-11-05 1096
131 Dubrovnik6K and Rome16K The Dubrovnik6K and Rome16K datasets are image collections for SfM reconstruction, where the suffix refers to the number of images in the dataset. Dubrovnik6... reconstruction, sfm, urban, landmark, dubrovnik, rome link 2017-03-10 1029
127 Stable Structure from Motion The Stable Structure from Motion datasets due to size limitations cannot put the images online. Instead here are the tracked image points and the final reconstr... sfm, reconstruction, geometry, stability, robust, 3d, landmark, church link 2013-08-08 1286
125 Google Street View Pittsburgh Research The Google Street View Pittsburgh Research dataset is a street-level image collection provided by Google for research purposes. The dataset provided here co... 3d, reconstruction, sfm, urban, pittsburgh, panorama link 2017-08-04 2114
123 CMU/VMR Urban Image+Laser CMU/VMR Urban Image+Laser dataset contains 372 images linked with 3D laser points projections. There are additional images (due to the laser scanner being turne... reconstruction, sfm, urban, semantic, segmentation, laser link 2013-04-02 1027
122 Symmetric Bundle Adjustment The Symmetric Bundle Adjustment dataset contains four sequences of the CAB building, Barcelona, Redmond and Capitole for 3D reconstruction considering symmetrie... reconstruction, sfm, urban, bundle adjustment, symmetry link 2013-03-12 902
121 Oakland 3D This repository contains labeled 3-D point cloud laser data collected from a moving platform in a urban environment. Data are provided for research purposes. ... reconstruction, sfm, urban, semantic, segmentation, laser link 2014-06-10 999
120 Samantha The SAMANTHA (Structure-and-Motion Pipeline on a Hierarchical Cluster Tree) dataset contains 4 sequences for 3D reconstruction: Pozzoveggiani, Piazza Dante, Pia... reconstruction, sfm, landmark, model, geometry link 2013-03-12 1186
84 Aachen Retrieval The Aachen dataset consists of 4479 images taken with multiple cameras (3GB), 369 query images taken with the camera of a mobile phone together with their SIFT ... retrieval, aachen, landmark, sfm, reconstruction link 2013-03-11 947
83 Ikonos Aerial Since its launch in September 1999, Space Imaging IKONOS earth imaging satellite has provided a reliable stream of image data that has become the standard for c... reconstruction, sfm, urban, aerial link 2013-03-11 965
82 Zurich City Hall Zurich City Hall dataset (also CIPA dataset) nformation: Place: City Hall, Zurich, Switzerland Number of Images: 15, 1280 x 1000 pixels Camera: Fuji DS 30... reconstruction, sfm, urban, zurich link 2017-07-17 900
74 PMVS 3D Photography The following are multiview stereo data sets captured in our lab: a set of images, camera parameters and extracted apparent contours of a single rigid object. E... sfm, reconstruction, depth, dense, mesh link 2017-01-31 1163
73 Strecha Dense MVS An evaluation benchmark for dense MVS for these datasets fountain-P11, Herz-Jesu-P8, entry-P10, castle-P19, Herz-Jesu-P25, castle-P30 . Images (corrected for... sfm, reconstruction, benchmark, depth, dense, mesh link 2017-12-11 1575
72 Acute3D Aiguille du Midi MVS Aiguille du Midi. France showing photographs with Camera: Mamiya ZD. 55mm. - Resolution: 5Mpixels, 53 images - Photographer: B. Vallet (Imagine/EVD - 2006) ... sfm, reconstruction, mesh, large scale, outdoor link 2013-03-21 931
68 The KITTI Vision Benchmark Suite We take advantage of our autonomous driving platform Annieway to develop novel challenging real-world computer vision benchmarks. Our tasks of interest are: ste... stereo, depth, flow, detection tracking, reconstruction, sfm, odometry, segmentation, semantic car depth link 2017-11-26 1361
67 Middlebury MVS Dino The object is a plaster dinosaur (stegosaurus). Click on thumbnail for a full-sized (640x480) image. Resolution of ground truth model: 0.00025m (you may wish to... sfm, reconstruction, benchmark, multiview, 3d, link 2013-09-20 939
66 Middlebury MVS Temple The object is a plaster reproduction of Temple of the Dioskouroi in Agrigento, Sicily. Click on thumbnail for a full-sized (640x480) image. Resolution of ground... sfm, reconstruction, benchmark, multiview, 3d, link 2013-09-20 811
63 Paris500k The Paris500k dataset consists of 501,356 geotagged images collected from Flickr and Panoramio. The dataset was collected from a geographic bounding box rather ... retrieval, paris, landmark, geotag, flickr, panoramio, sfm, reconstruction link 2016-12-23 1186
54 Notre Dame The Notre Dame de Paris dataset used for 3D SfM reconstruction and contains 715 images provided by Noah Snavely. There are also version for NotreDame by Mic... limited, flickr, landmark, sfm, paris, frontview, reconstruction, 3d, pointcloud link 2015-06-19 986
53 DTU Robot The DTU Robot dataset consists of color images of 60 scenes acquired in a controlled setup from 119 different positions and under different lighting. For each s... feature, detection, description, matching, sfm, reconstruction, illumination link 2016-05-15 883
49 PhotoTourism Pair The data is taken from Photo Tourism reconstructions from Trevi Fountain (Rome), Notre Dame (Paris) and Half Dome (Yosemite). Each dataset consists of a series ... feature, matching, description, pair, sfm link 2013-03-11 918
39 Leuven Stereo Scene The Leuven Stereo Scene dataset is a scene and depth dataset. There exist two variants of this dataset - a CVPR 2007 paper [1] by Leibe et al. for detection and... segmentation, semantic, reconstruction, urban, sfm, 3d, leuven, depth, stereo link 2013-11-03 1817
34 CamVid The Cambridge-driving Labeled Video Database (CamVid) dataset from Gabriel Brostow [?] contains ten minutes of video footage and corresponding semantically labe... sfm, depth, semantic, segmentation, urban link 2016-04-18 2898


total views: 32690 5 queries in 0.00012683868408203s 0.00013113021850586s 0.00017881393432617s 2.0027160644531E-5s 0.0010569095611572s and total 0.0071730613708496s