Yet Another Computer Vision Index To Datasets (YACVID)

This website provides a list of frequently used computer vision datasets. Wait, there is more!
There is also a description containing common problems, pitfalls and characteristics and now a searchable TAG cloud.
Plus, this is open for crowd editing (if you pass the ultimate turing test)! - Questions? yacvid [at] hayko [dot] at

Content, Design and Idea © by Hayko Riemenschneider, 2011-2016. Texts and Images are subject of copyright by the respective authors.

Hey! If you're reading this, why not help and update the description of the dataset you're working on?

Add a new dataset

2d   3d   4d   aachen   abdomen   abrupt   accelerometer   action   activity   address   adhead   adjustment   aerial   aesthetic   aesthetics   age   aic   aircraft   airplane   airport   amazon   ambiguous   analysis   and   anger   animal   animation   annotation   anomaly   apartment   api   appearance   applelogo   architecture   articulated   aspect   attention   attribute   attributes   authentication   automatic   autonomous   avoid   axis   babyface   background   balance   baseline   behavior   belgium   benchmark   benchmarking   bike   bilateral   binary   biology   biometric   biometry   blender   blur   body   bone   bottle   boundingbox   brand   bremen   buffy   building   bullseye   bundle   bunny   byu   cad   calibration   caltech   camera   canada   captioning   capture   car   cardinal   categorization   category   celebrity   cell   centered   chair   challenge   challenges;   change   chemistry   chest   chromaticity   church   circle   cities   city   classification   clothing   clustering   clutter   cnn   co-segmentation   coco   code   codebook   coffee   color   community   comparison   conditions   constancy   context   contour   cooking   copyright   cosegmentation   counting   cover   cow   crepe   cross-view   crowd   ct   cutting   dance   database;   dataset   dataset;   day   decomposition   deep   defocus   deformation   dense   depth   description   descriptor   detail   detection   detection;   dichromatic   disgust   disparity   dogs   domain   driving   dubrovnik   duplicate   dynamic   ear   ecocentric   edge   egocentric   ellipses   emotion   estimation   evaluation   event   expression   eye   facade   face   facial   fear   feature   field   fine-grained   fingerprints   fingertip   first-person   fish   fisheye   fitting   flickr   flight   floorplan   flow   fly   flying   food   foot   foreground   foreground;   fov   frames   frontview   fundus   gait   game   gender   genetic   genome   geography   geometry   geotag   germany   gesture   getry   gif   giraffe   gis   global   google   gps   grammar   graphics   graz   ground   ground-truth;   groundtruth   group   hand   hands   handwritten   hd   head   heart   heat   hierarchy   high-definition   highlight   highway   holes   horse   human   identification   illumination   image   imagenet   images   imdb   indoor   inertial   initialization   inserts   instance   intake   interaction   interactive   interest   internet   invariance   ir   isar   joy   kernels   keyframe   kimia   kinect   label   labeling   laboratory   landmark   lane   language   large   large-scale   laser   lattice   layout   learning   letter   leuven   lidar   light   lightfield   lighting   limited   line   lisbon   liver   local   localization   location   logo   lowlevel   machine   manhattan   map   mask   match   matching   material   medial   medical   medicine   memorability   mesh   milling   mirror   mobile   model   modeling   modelling   monitoring   mono   montage   motion   motion-capture-data   motorbike   mouse   movement   movie   movies   moving   mpeg   mug   multi-camera;   multi-class   multi-mode   multi-sensor;   multi-view   multilabel   multiple   multitarget   multiview   naming   natural   nature   navigation   network   neutral   newyork   night   noise   normal   nude   number   object   objects   occlusion   ocr   odometry   omnidirection   omnidirectional   open-view   operation   optical   optimization   organ   original   osnabrueck   outdoor   overhead   overlap   oxford   pair   pairwise   panorama   panoramio   parallel   paris   parsing   part   partial   pasadena   pascal   patch   path   pattern   pedestrian   people   person   perspective   phase   photogrammetry   physics   pittsburgh   place   plane   planning   point   pointcloud   popularity   pornography   pose   pose;   presentation   pressure   primitive   procedural   profile   proposal   ptz   quality   radar   randomnoise   rank   ranking   ransac   rate   ratio   re-identification   real   realism   recipe   recognition   recognition;   reconstruction   rectification   rectified   reflection   registration   regular   reidentification   remote   removal   rendering   repetition   resolution   retina   retinal   retrieval   rgb   rgbd   rgbd;   road   robot   robust   rome   room   rotation   sad   saliency   sampling   sanfrancisco   satellite   scale   scan   scanner   scene   scenes   search   segmentation   semantic   sense   sensing   sequence   sfm   shadow   shadows   shape   shapes   sheffield   shoes   shots   shutter   sideview   sign   similarity   simultaneous   single   singletarget   singleview   skeleton   sketch   skin   sky   slam   soccer   social   software   source   space   spain   sphere   sport   stability   stabilization   static   stationary   stereo   stereovision   stochastic   street   streetside   streetview   structure   structure-from-motion   structured   structures   study   stuff   stylization   subpixel   subtraction;   summarization   summary   superresolution   supervised   surface   surgery   surprise   surveillance   swan   switzerland   symmetry   synthetic   table   target   taxonomy   temporal   text   texture   texture-less   therapy   thermal   things   time   time-series   tiny   tool   tools   top-view   tracking   traffic   trajectory   transfer   transportation   triangulation   truth   tuberculosis   type   uas   ultrasound   understanding   uneven   unmanned   unsupervised   urban   user   vanishing   variation   vehicle   vehicles   video   video2gif   videos   videosurveillance   view   viewpoint   vision   visual   volleyball   vt   water   weakly   wear   wearable   weather   webcam   white   wide   wikipedia   wild   workflow   world   xray   year   zoom   zurich  
«showing 562 tags of 562 total tags for 398 datasets (1.41) »

DID Name Description Tags URL Date Views
395 AWS Public Datasets AWS hosts a variety of public datasets that anyone can access for free. Previously, large datasets such as satellite imagery or genomic data have required hour... amazon classification deep learning segmentation recognition satellite human biology space image resolution link 2017-07-28 33
386 Utrecht University, ShakeFive2 ShakeFive2 A collection of 8 dyadic human interactions with accompanying skeleton metadata. The metadata is frame based xml data containing the skeleton join... human interaction Kinect video link 2017-06-26 43
355 IMPART multi-modal/multi-view The multi-modal/multi-view datasets are created in a cooperation between University of Surrey and Double Negative within the EU FP7 IMPART project. The sourc... multi-view multi-mode video rgbd lidar 3d model color indoor outdoor dynamic action face human emotion link 2017-01-01 259
354 Facial Expression Research Group Database (FERG-DB), University of Washington, Seattle FERG-DB is a database of stylized characters with annotated facial expressions. The database contains multiple face images of six stylized characters. The chara... Face, Facial expression, Animation, Stylization, annotation emotion, deep learning, anger, sad, joy, disgust, surprise, neutral, fear, cardinal classification, human transfer, image retrieval link 2017-02-27 360
340 Ljubljana CVL Face Database Database contains 798 images of 114 persons, with 7 images per person and is freely available for research purposes. All images were taken in supervised conditi... face pedestrian person recognition biometry human illumination lighting link 2017-02-22 300
339 Annotated Web Ears Dataset (AWE Dataset) Dataset contains 1000 images of 100 persons, with 10 images per person and is freely available. All images were acquired by cropping ears from images from the i... ear biometry person pedestrian recognition human lighting link 2017-02-16 284
337 WIDER Attribute Dataset WIDER ATTRIBUTE dataset is a human attribute recognition benchmark dataset, of which images are selected from the publicly available WIDER dataset. There are a ... Attribute recognition, Human attribute link 2016-09-22 402
327 PIROPO Database: People in Indoor ROoms with Perspective and Omnidirectional cameras The PIROPO database (People in Indoor ROoms with Perspective and Omnidirectional cameras) comprises multiple sequences recorded in two different indoor rooms, u... people surveillance perspective omnidirectional fisheye indoor room detection human link 2017-02-16 619
308 TST Intake Monitoring dataBase t is composed of food intake movements, recorded with Kinect V1 (320240 depth frame resolution), simulated by 35 volunteers for a total of 48 tests. The device... human food intake monitoring behavior kinect pointcloud tracking age groundtruth link 2016-02-11 430
305 SPHERE human skeleton movements The SPHERE human skeleton movements dataset was created using a Kinect camera, that measures distances and provides a depth map of the scene instead of the clas... human action behavior motion movement video skeleton depth kinect link 2016-03-24 656
299 CAMP-TUM: Multiple Human Pose Estimation from Multiple Views We introduce the Shelf dataset for multiple human pose estimation from multiple views. In addition we annotate the body joints in the Campus dataset from CVLAB@... 3D human pose estimation multiple view motion capture link 2015-07-15 557
294 Happy People Images Database Group emotion recognition in images - Happiness Intensity labels for group of people in images. The images have been collected from Flickr using keyword search ... group, facial expression, emotion, wild, human, flickr, behavior link 2015-07-13 665
289 ETHZ CVL Clust MICCAI 2015 Challenge on Liver Ultrasound Tracking Munich, October 9, 2015 (Full Day) Outline Ultrasound (US) imaging is a widely used medical imaging techn... medical liver tracking ultrasound therapy human organ benchmark real link 2015-06-19 471
288 Berkeley Urban Street tracking The UrbanStreet dataset used in the paper can be downloaded here [188M] . It contains 18 stereo sequences of pedestrians taken from a stereo rig mounted on a ca... tracking detection segmentation multitarget recognition video pedestrian urban human link 2015-07-14 1051
286 HDA Person Dataset - ISR Lisbon The High Definition Analytics (HDA) dataset is a multi-camera High-Resolution image sequence dataset for research on High-Definition surveillance: Pedestrian De... Video Surveillance Pedestrian Detection Re-Identification Multiview Tracking Benchmark Indoor High-Definition Camera Network lisbon human link 2017-01-26 1475
276 TST TUG (Timed Up and Go) The TUG (Timed Up and Go test) dataset consists of actions performed three times by 20 volunteers. The people involved in the test are aged between 22 and 39, w... action recognition time kinect wearable accelerometer human video link 2015-05-02 507
275 TST fall detection It is composed of ADL (activity daily living) and fall actions simulated by 11 volunteers. The people involved in the test are aged between 22 and 39, with diff... action recognition detection depth kinect wearable accelerometer human video link 2017-03-14 860
272 Stanford 40 Actions The Stanford 40 Actions dataset contains images of humans performing 40 actions. In each image, we provide a bounding box of the person who is performing the ac... human action recognition detection boundingbox link 2015-06-19 798
265 Salient Montages: Human-centric Video Summarization The Salient Montages is a human-centric video summarization dataset from the paper [1]. In [1], we present a novel method to generate salient montages from u... video summarization montage saliency wearable human link 2015-05-02 588
264 Domain-specific Personal Videos Highlight Dataset The domain-specific personal videos highlight dataset from the paper [1] describes a fully automatic method to train domain-specific highlight ranker for raw p... video summarization saliency wearable human action recognition domain link 2015-05-02 654
263 Crowd Dataset The crowd datasets are collected from a variety of sources, such as UCF and data-driven crowd datasets. The sequences are diverse, representing dense crowd in t... crowd video detection anomaly scene understanding human pedestrian link 2016-11-05 1366
261 MPI Multi-View Collection GVV datasets Welcome to the homepage of the gvvperfcapeva datasets. This site serves as a hub to access a wide range of datasets that have been created for projects of the G... video multiview tracking face mesh reconstruction depth human action pose link 2014-12-10 667
257 FaceScrub The FaceScrub dataset comprises a total of 107818 unconstrained face images of 530 celebrities crawled from the Internet, with about 200 images per person. M... face detection recognition celebrity people human link 2014-11-24 819
255 Robotic 3D Scan Repository The Robotic 3D Scan Repository from Osnabrueck contains 23 different datasets showing a veriaty of 3D scans for objects, humans, cities, university campus, heat... 3d reconstruction scan laser heat urban city human aerial germany bremen lidar osnabrueck link 2015-04-10 696
254 ChokePoint Dataset We collected a video dataset, termed ChokePoint, designed for experiments in person identification/verification under real-world surveillance conditions using e... human pedestrian identification recognition multiview sequence face detection real world surveillance clustering link 2015-05-02 1049
247 PASCAL VOC Parts The PASCAL VOC is augmented with segmentation annotation for semantic parts of objects. For example, for the person category, we provide segmentation mask for 2... detection recognition pascal object part pedestrian human segmentation semantic link 2014-09-30 1141
245 ETHZ CVL Video SumMe The Video Summarization (SumMe) dataset consists of 25 videos, each annotated with at least 15 human summaries (390 in total). The data consists of videos, anno... video summary benchmark human groundtruth action event link 2016-10-21 1181
235 Kindergarten Video Surveillance The dataset consist of the about 50 hours obtained from kindergarten surveillance videos. Dataset, totally approximately 100 videos sequences (1000GB, 50 hours)... human action behavior segmentation video background surveillance link 2015-10-08 1151
232 Pratheepan Human Skin Detection Dataset The images in this dataset are downloaded randomly from Google for human skin detection research. It has been used in the paper: W.R. Tan, C.S. Chan, Y. Prathee... skin detection, skin segmentation, human detection, skin dataset link 2017-05-03 1968
227 Omnidirectional and panoramic image dataset We share our omnidirectional and panoramic image dataset (with annotations) to be used for human and car detection. Please reach through: panorama detection car omnidirection human recognition link 2017-01-13 1161
219 JPL First-Person Interaction JPL First-Person Interaction dataset (JPL-Interaction dataset) is composed of human activity videos taken from a first-person viewpoint. The dataset particularl... video action recognition interactive motion human link 2014-02-03 609
213 ChairGest Gestures ChairGest is an open challenge / benchmark. The task consists in spotting and recognizing gestures from multiple synchronized sensors: 1 Kinect and 4 Xsens Ine... benchmark recognition kinect gesture detection human link 2014-06-06 608
212 Polo Instance Segmentation The Polo instance segmentation dataset is a semantic segmentation task for Hough transform based segmentation masks. It consists of supervised segmentation for ... semantic segmentation horse human outdoor mask scene understanding n/a 2016-01-21 873
207 CASIA Gait Recognition Dataset Dataset A (former NLPR Gait Database) was created on Dec. 10, 2001, including 20 persons. Each person has 12 image sequences, 4 sequences for each of the three ... gait recognition biometry action classification motion human foot pressure link 2017-03-10 2119
192 Our Database of Faces The Our Database of Faces (ORL) dataset contains ten different images of each of 40 distinct subjects. For some subjects, the images were taken at different tim... face recognition illumination human expression link 2013-09-23 873
173 MuHAVi and MAS human action The Multicamera Human Action Video Data (MuHAVi) Manually Annotated Silhouette Data (MAS) are two datasets consisting of selected action sequences for the eval... human action behavior segmentation video background link 2017-07-25 1452
171 CHALEARN Multi-modal Gesture Challenge The CHALEARN Multi-modal Gesture Challenge is a dataset +700 sequences for gesture recognition using images, kinect depth, segmentation and skeleton data. ht... gesture, kinect, recognition, human, action, illumination, depth, segmentation, skeleton link 2013-08-09 765
170 Sheffield Kinect Gesture (SKIG) dataset The Sheffield Kinect Gesture (SKIG) dataset contains 2160 hand gesture sequences (1080 RGB sequences and 1080 depth sequences) collected from 6 subjects. ... gesture, kinect, recognition, human, action, illumination, depth link 2013-08-09 866
153 MSRC Kinect Gesture Dataset The Microsoft Research Cambridge-12 Kinect gesture dataset consists of sequences of human movements, represented as body-part locations, and the associated gest... gesture, kinect, recognition, human, action link 2013-08-08 896
138 Buffy The Buffy dataset contains images selected from the TV series, Buffy: the Vampire Slayer. We select a set of 452 images from the first two episodes for training... segmentation, detection, buffy, movie, human link 2015-02-07 677
16 PETS 2009 The PETS 2009 dataset contains 3 parts showing multi-view sequences containing pedestrians walking in an outdoor environment. The parts are used for person coun... frontview, outdoor, pedestrian, detection, tracking, overlap, occlusion multitarget, human link 2015-06-19 1282
14 INRIA People The INRIA People dataset from Navneet Dalal and Bill Triggs [DalalCVPR2005] consists of training and testing data. The training contains 1805 images and X peopl... detection, pedestrian, sideview, frontview, human, boundingbox link 2015-06-19 1202

total views: 34433 5 queries in 0.00013208389282227s 0.00013303756713867s 0.00015711784362793s 0.00011301040649414s 0.0012021064758301s and total 0.0069260597229004s