Yet Another Computer Vision Index To Datasets (YACVID)

This website provides a list of frequently used computer vision datasets. Wait, there is more!
Each dataset comes with a description covering common problems, pitfalls and characteristics, and there is now a searchable tag cloud.
Plus, the index is open for crowd editing (if you pass the ultimate Turing test)! Questions? yacvid [at] hayko [dot] at

Content, Design and Idea © by Hayko Riemenschneider, 2011-2016. Texts and images are subject to copyright by their respective authors.

Hey! If you're reading this, why not help and update the description of the dataset you're working on?

591 tags in total for 421 datasets (1.4).


Datasets tagged: object
DID | Name | Description | Tags | URL | Date | Views
385 | WildLife Documentary (WLD) Dataset | The dataset contains 15 documentary films that are downloaded from YouTube, whose durations vary from 9 minutes to as long as 50 minutes, and the total number o... | Video object detection | link | 2017-06-23 | 135
384 | An RGB-D Dataset for 6D Pose Estimation of Texture-less Objects (T-LESS) | A dataset acquired with 3 synchronized sensors (Primesense Carmine 1.09, Microsoft Kinect v2, Canon IXUS 950 IS), featuring: * 30 industry-relevant objects:... | RGBD 3D pose texture-less object estimation | link | 2017-09-12 | 140
376 | ScanNet | ScanNet is an RGB-D video dataset containing 2.5 million views in more than 1500 scans, annotated with 3D camera poses, surface reconstructions, and instance-le... | scene indoor synthetic cad room layout rendering realism 3d segmentation object recognition | link | 2017-05-12 | 226
375 | SUNCG: Indoor Scenes | The SUNCG dataset is a Large 3D Model Repository for Indoor Scenes. SUNCG is an ongoing effort to establish a richly-annotated, large-scale dataset of 3D s... | scene indoor synthetic room layout rendering realism 3d segmentation object recognition | link | 2017-05-02 | 211
373 | DAVIS: Densely Annotated VIdeo Segmentation | We present the 2017 DAVIS Challenge, a public competition specifically designed for the task of video object segmentation. Following the footsteps of other succ... | object tracking segmentation video benchmark code hd quality resolution | link | 2017-08-03 | 209
372 | VOT2016 segmentation | The VOT2016 pixel-wise annotations dataset contains pixel-wise per-frame annotations for sequences from VOT2016 dataset. The annotation is in a form of BW image... | object tracking segmentation mask annotation visual | link | 2017-04-17 | 219
346 | LASIESTA (Labeled and Annotated Sequences for Integral Evaluation of SegmenTation Algorithms) | LASIESTA is composed of many real indoor and outdoor sequences organized in different categories, each one covering a specific challenge in moving object det... | dataset groundtruth motion object detection foreground background subtraction challenge stationary camera | link | 2017-09-12 | 388
342 | ICS-FORTH + Modelling of 2D Shapes with Ellipses | The dataset contains more than 4,536 2D shapes included in standard as well as in home-built datasets. Our goal is to represent a given 2D shape with an au... | shape ellipse fitting modelling 2d object classification | link | 2017-11-28 | 286
312 | University of León - Edge profile milling head tool data set | This data set comprises 144 images of an edge profile cutting head of a milling machine. The head tool contains a total of 30 cutting inserts. The cutting head ... | milling head tool inserts localization object cutting tools edge profile tool wear monitoring | link | 2015-11-06 | 500
298 | Freiburg-Berkeley Motion Segmentation | The Freiburg-Berkeley Motion Segmentation Dataset (FBMS-59) is an extension of the BMS dataset with 33 additional video sequences. A total of 720 frames is anno... | video segmentation benchmark object tracking pedestrian groundtruth motion | link | 2017-03-21 | 980
296 | Video Segmentation Benchmark | The Video Segmentation Benchmark (VSB100) provides ground truth annotations for the Berkeley Video Dataset, which consists of 100 HD quality videos divided into... | video segmentation benchmark object tracking pedestrian groundtruth motion | link | 2017-03-21 | 1046
287 | INRIA Lafarge Benchmarks | Some datasets and evaluation tools are provided on this page for four different computer vision and computer graphics problems. Population counting Line-ne... | 3d surface reconstruction groundtruth pointcloud object detection line road network urban crowd pedestrian counting | link | 2015-06-18 | 905
284 | TRANCOS Overlapping Car Crowds | The TRaffic ANd COngestionS (TRANCOS) dataset, a novel benchmark for (extremely overlapping) vehicle counting in traffic congestion situations. It consists of 1... | object detection car transportation vehicle highway urban spain traffic | link | 2015-06-16 | 994
278 | Comprehensive Cars (CompCars) | The Comprehensive Cars (CompCars) dataset contains data from two scenarios, including images from web-nature and surveillance-nature. The web-nature data contai... | car vehicle recognition attribute classification fine-grained urban object | link | 2017-12-07 | 1537
271 | Labeling in 3D Scenes | This dataset package contains the software and data used for Detection-based Object Labeling on the RGB-D Scenes Dataset as implemented in the paper: Detecti... | 3d kinect reconstruction indoor depth object recognition | link | 2015-03-16 | 783
270 | B3DO: Berkeley 3D Object Dataset | For the first few decades of the field's existence, computer vision has been focused on algorithmic, logical approaches to perception. But it was only with the a... | 3d kinect reconstruction indoor depth object recognition | link | 2015-03-16 | 687
268 | HUJI Multi-illuminant Image Sequences dataset | The Multi-illuminant Image Sequences dataset contains 16 video sequences (13 with single light source and 3 with two global light sources), recorded with a HD ... | illumination nature physics dichromatic light chromaticity color constancy white balance object | link | 2015-02-20 | 649
258 | Visual Attributes dataset | The Visual Attributes dataset contains visual attribute annotations for over 500 object classes (animate and inanimate) which are all represented in ImageNet. E... | classification recognition attribute imagenet object | link | 2016-10-02 | 946
248 | VIDEO datasets overview | Many different labeled video datasets have been collected over the past few years, but it is hard to compare them at a glance. So we have created a handy spread... | video benchmark recognition classification detection object action | link | 2014-09-30 | 1140
247 | PASCAL VOC Parts | The PASCAL VOC is augmented with segmentation annotation for semantic parts of objects. For example, for the person category, we provide segmentation mask for 2... | detection recognition pascal object part pedestrian human segmentation semantic | link | 2014-09-30 | 1258
246 | Bristol Egocentric Object Interactions Dataset | The BEOID dataset includes object interactions ranging from preparing a coffee to operating a weight lifting machine and opening a door. The dataset is recorded... | video interaction object egocentric pose 3d tracking | link | 2017-09-12 | 1065
240 | Microsoft COCO | The Microsoft COCO (mscoco) is an image recognition and segmentation dataset which contains more than 300k images for more than 70 categories. Other features: Mo... | object context segmentation detection recognition benchmark semantic | link | 2015-05-02 | 1490
217 | Youtube-Objects dataset | The YouTube-Objects dataset is composed of videos collected from YouTube by querying for the names of 10 object classes. It contains between 9 and 24 videos for... | video object detection segmentation flow optical | link | 2014-02-03 | 1059
204 | UCF Person and Car VideoSeg | The UCF Person and Car VideoSeg dataset consists of six videos with groundtruth for video object segmentation. Surfing, jumping, skiing, sliding, big car, sm... | video segmentation object motion model camera groundtruth | link | 2015-04-19 | 1066
203 | GaTech VideoSeg | The GaTech VideoSeg dataset consists of two (waterski and yunakim?) video sequences for object segmentation. There exists no groundtruth segmentation annotat... | video segmentation object motion model camera | link | 2013-10-09 | 973
202 | GaTech SegTrack | The SegTrack dataset consists of six videos (five are used) with ground truth pixelwise segmentation (6th penguin is not usable). The dataset is used for accura... | video segmentation object proposal flow optical motion model camera stationary groundtruth | link | 2013-10-09 | 885
199 | THUR15000 | We introduce a labeled dataset of categorized images for evaluating sketch based image retrieval. Using Flickr, we downloaded about 3000 images for each of the ... | group saliency object detection visual attention sketch shape retrieval internet | link | 2013-10-08 | 971
198 | THUS10000 | The THUS10000 benchmark dataset comprises 10,000 images, each of which has an unambiguous salient object and the object region is accurately annotated with p... | segmentation saliency object detection visual attention | link | 2015-01-11 | 1115
191 | Daimler Mono Pedestrian Classification Benchmark | The Daimler Mono Pedestrian Classification Benchmark dataset consists of two parts: a base data set. The base data set contains a total of 4000 pedestrian- a... | pedestrian classification outdoor urban object scale illumination | link | 2013-09-18 | 869
190 | Daimler Mono Pedestrian Detection Benchmark | The Daimler Mono Pedestrian Detection Benchmark dataset contains a large training and test set. The training set contains 15.560 pedestrian samples (image cut-o... | pedestrian detection outdoor urban mono scale object | link | 2013-09-18 | 989
189 | Farman Institute 3D Point Sets | The Farman Institute 3D Point Sets dataset contains 11 objects scanned by a 3D laser scanner. This dataset was peer-reviewed by Image Processing On Line: Farman Institu... | 3d laser scanner object reconstruction model point | link | 2013-09-18 | 758
188 | KTH Multiview Football | The KTH Multiview Football dataset contains 771 images of football players, including images taken from 3 views at 257 time instances and 14 annotated body joints. ... | multiview pedestrian tracking detection object camera outdoor game soccer pose recognition multitarget | link | 2016-09-18 | 1430
187 | Aspect Layout dataset | The Aspect Layout dataset is designed to allow evaluation of object detection for aspect ratios in perspective images. Author text: In this project we see... | detection object aspect ratio perspective layout | link | 2013-09-06 | 645
166 | ICG Multi-Camera Datasets | The ICG Multi-Camera datasets consist of Easy Data Set (just one person) Medium Data Set (3-5 persons, used for the experiments) Hard Data Set (crowded sc... | multiview pedestrian tracking detection object camera calibration graz indoor video multitarget | link | 2015-06-19 | 1208
165 | ICG Multi-Camera and Virtual PTZ | The ICG Multi-Camera and Virtual PTZ dataset contains the video streams and calibrations of several static Axis P1347 cameras and one panoramic video from a sph... | multiview pedestrian tracking detection object camera calibration graz network video panorama crowd outdoor multitarget | link | 2017-08-19 | 1314
164 | ICG Lab 6 (Multi-Camera Multi-Object Tracking) | The ICG Lab 6 (Multi-Camera Multi-Object Tracking) dataset contains 6 indoor people tracking scenarios recorded at our laboratory using 4 static Axis P1347 came... | multiview pedestrian tracking detection object laboratory camera calibration evaluation segmentation graz | link | 2017-12-05 | 1816
147 | FlickrLogos-32 | The FlickrLogos-32 dataset contains photos showing brand logos and is meant for the evaluation of multi-class logo recognition as well as logo retrieval methods... | flickr, logo, detection, retrieval, image, object recognition, machine learning, classification brand boundingbox | link | 2017-11-14 | 1233
137 | Synthetic CAD models | The Synthetic CAD Models dataset consists of X synthetic CAD models for detection (planar) primitives. Efficient RANSAC for Point-Cloud Shape Detection Ruwe... | model, ransac, 3d object, reconstruction, primitive, synthetic | link | 2013-08-08 | 869
114 | TUD Shapes 1+2 | This material is supplementary to Michael Stark, Bernt Schiele. How Good are Local Features for Classes of Geometric Objects. Eleventh IEEE International C... | shape object classification tool binary | link | 2013-08-08 | 772
101 | CIFAR-10 / 100 | The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images. ... | classification, tiny, color, patch, scene, object | link | 2013-08-08 | 854
51 | PN Learning | PN Learning - How does TLD work? Tracking estimates the object location as long as the object is visible. During tracking all observed patterns of the object... | single target tracking learning object pedestrian bike face | link | 2017-11-28 | 741
50 | Babenko tracking | The Babenko tracking dataset contains 12 video sequences for single object tracking. For each clip they provide (1) a directory with the original image s... | tracking single object animal face occlusion video | link | 2016-08-08 | 2390
44 | UK Bench | The UK Bench dataset from Henrik Stewenius and David Nister contains 10200 images of N=2550 groups with four images each at size 640x480. The images are rotated... | retrieval image object centered rotation | link | 2017-08-31 | 2021
28 | CMU Faces - Frontal faces | The MIT + CMU frontal face dataset from H. Rowley contains 130 images with 507 labeled frontal faces from movie, portrait and media sources. It is mostly graysc... | frontview, face, detection object boundingbox | link | 2015-06-19 | 874
20 | CALTECH 256 | The CALTECH 256 dataset by Li Fei-Fei contains 30607 images for 256 categories.... | classification centered object scene image | link | 2013-08-08 | 846
19 | CALTECH 101 | The CALTECH 101 dataset by Li Fei-Fei contains images for 101 categories with about 40 to 800 images per category. Most categories have about 50 images at rough... | classification centered object scene image | link | 2013-08-08 | 866

