Yet Another Computer Vision Index To Datasets (YACVID)

This website provides a list of frequently used computer vision datasets. Wait, there is more!
There are also descriptions covering common problems, pitfalls and characteristics, and now a searchable tag cloud.
Plus, this is open for crowd editing (if you pass the ultimate Turing test)! Questions? yacvid [at] hayko [dot] at

Content, Design and Idea © by Hayko Riemenschneider, 2011-2016. Texts and images are subject to copyright by their respective authors.

Hey! If you're reading this, why not help and update the description of the dataset you're working on?


2d   3d   4d   aachen   abdomen   abrupt   accelerometer   action   activity   address   adhead   adjustment   aerial   aesthetic   aesthetics   age   aic   aircraft   airplane   airport   ambiguous   analysis   and   anger   animal   animation   annotation   anomaly   apartment   api   appearance   applelogo   architecture   articulated   aspect   attention   attribute   attributes   authentication   autonomous   avoid   axis   babyface   background   balance   baseline   behavior   belgium   benchmark   benchmarking   bike   bilateral   binary   biology   biometric   biometry   blender   body   bone   bottle   boundingbox   brand   bremen   buffy   building   bullseye   bundle   bunny   byu   calibration   caltech   camera   canada   captioning   capture   car   cardinal   categorization   category   celebrity   cell   centered   chair   challenges;   change   chest   chromaticity   church   circle   cities   city   classification   clustering   clutter   cnn   co-segmentation   coco   code   codebook   coffee   color   community   comparison   conditions   constancy   context   contour   copyright   cosegmentation   counting   cover   cow   cross-view   crowd   ct   cutting   database;   dataset   dataset;   day   decomposition   deep   deformation   dense   depth   description   descriptor   detail   detection   detection;   dichromatic   disgust   disparity   dogs   domain   driving   dubrovnik   duplicate   dynamic   ear   ecocentric   edge   egocentric   ellipses   emotion   estimation   evaluation   event   expression   eye   facade   face   facial   fear   feature   field   fine-grained   fingerprints   fingertip   first-person   fish   fisheye   fitting   flickr   flight   floorplan   flow   fly   flying   food   foot   foreground   foreground;   fov   frames   frontview   fundus   gait   game   genetic   genome   geography   geometry   geotag   germany   gesture   getry   gif   giraffe   gis   global   google   gps   grammar   graphics   graz   ground   
ground-truth;   groundtruth   group   hand   hands   handwritten   head   heart   heat   hierarchy   high-definition   highlight   highway   holes   horse   human   identification   illumination   image   imagenet   images   imdb   indoor   inertial   initialization   inserts   instance   intake   interaction   interactive   interest   internet   invariance   ir   isar   joy   keyframe   kimia   kinect   label   labeling   laboratory   landmark   lane   language   large   laser   lattice   layout   learning   letter   leuven   lidar   light   lightfield   lighting   limited   line   lisbon   liver   local   localization   location   logo   lowlevel   machine   manhattan   map   mask   match   matching   material   medial   medical   memorability   mesh   milling   mirror   mobile   model   modeling   modelling   monitoring   mono   montage   motion   motion-capture-data   motorbike   mouse   movement   movie   movies   moving   mpeg   mug   multi-camera;   multi-class   multi-mode   multi-sensor;   multi-view   multilabel   multiple   multitarget   multiview   naming   natural   nature   navigation   network   neutral   newyork   night   noise   nude   number   object   objects   occlusion   ocr   odometry   omnidirection   omnidirectional   open-view   optical   optimization   organ   osnabrueck   outdoor   overhead   overlap   oxford   pair   pairwise   panorama   panoramio   paris   parsing   part   partial   pasadena   pascal   patch   path   pedestrian   people   person   perspective   photogrammetry   physics   pittsburgh   place   plane   planning   point   pointcloud   popularity   pornography   pose   presentation   pressure   primitive   procedural   profile   proposal   ptz   quality   radar   randomnoise   rank   ranking   ransac   rate   ratio   re-identification   real   recognition   recognition;   reconstruction   rectification   rectified   reflection   registration   regular   remote   removal   repetition   retina   retinal   retrieval   rgb   
rgbd   road   robot   robust   rome   room   rotation   sad   saliency   sampling   sanfrancisco   scale   scan   scanner   scene   scenes   search   segmentation   semantic   sense   sensing   sequence   sfm   shadow   shadows   shape   shapes   sheffield   shoes   shots   shutter   sideview   sign   similarity   single   singletarget   singleview   skeleton   sketch   skin   sky   slam   soccer   social   software   source   spain   sphere   sport   stability   stabilization   static   stationary   stereo   stereovision   stochastic   street   streetside   streetview   structure   structure-from-motion   structures   study   stuff   stylization   subpixel   subtraction;   summarization   summary   superresolution   supervised   surface   surprise   surveillance   swan   switzerland   symmetry   synthetic   target   taxonomy   temporal   text   texture   therapy   thermal   things   time   time-series   tiny   tool   tools   tracking   traffic   trajectory   transfer   transportation   triangulation   truth   tuberculosis   type   uas   ultrasound   understanding   uneven   unmanned   unsupervised   urban   user   vanishing   variation   vehicle   vehicles   video   video2gif   videos   videosurveillance   view   viewpoint   vision   visual   volleyball   vt   water   weakly   wear   wearable   weather   webcam   white   wide   wikipedia   wild   world   xray   year   zoom   zurich  
«showing 524 tags of 524 total tags for 372 datasets (1.41) »

DID Name Description Tags URL Date Views
81 Zurich Hoengg Zurich Hoengg (Switzerland) is an aerial dataset. The dataset consists of 4 aerial images in colour (Figures 2-5), scanned with 14 microns, the format is Ti... aerial, semantic, segmentation, outdoor link 2013-03-11 638
82 Zurich City Hall Zurich City Hall dataset (also CIPA dataset) Information: Place: City Hall, Zurich, Switzerland Number of Images: 15, 1280 x 1000 pixels Camera: Fuji DS 30... reconstruction, sfm, urban, zurich link 2013-03-11 689
280 Yahoo Flickr Creative Commons 100M Yahoo Flickr Creative Commons 100M (YFCC100M) dataset contains a list of photos and videos. This list is compiled from data available on Yahoo! Flickr. All the ... flickr landmark image recognition detection reconstruction 3d clustering social community internet link 2015-09-24 656
154 WordNet WordNet is a large lexical database of English. Nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (synsets), each expressing a di... language, hierarchy, imagenet, classification link 2013-08-07 531
314 WIDER FACE: A Face Detection Benchmark WIDER FACE dataset is a large-scale face detection benchmark dataset with 32,203 images and 393,703 face annotations, which have high degree of variabilities in... face detection scale pose occlusion link 2016-02-11 638
337 WIDER Attribute Dataset WIDER ATTRIBUTE dataset is a human attribute recognition benchmark dataset, of which images are selected from the publicly available WIDER dataset. There are a ... Attribute recognition, Human attribute link 2016-09-22 250
261 MPI Multi-View Collection GVV datasets Welcome to the homepage of the gvvperfcapeva datasets. This site serves as a hub to access a wide range of datasets that have been created for projects of the G... video multiview tracking face mesh reconstruction depth human action pose link 2014-12-10 553
233 PASCAL Context We would like to announce the release of PASCAL-Context dataset. We augmented PASCAL VOC 2010 dataset with annotations for 400+ additional categories. In the cu... semantic segmentation pascal benchmark category recognition dense shape link 2014-07-17 709
68 The KITTI Vision Benchmark Suite We take advantage of our autonomous driving platform Annieway to develop novel challenging real-world computer vision benchmarks. Our tasks of interest are: ste... stereo, depth, flow, detection tracking, reconstruction, sfm, odometry, segmentation, semantic car depth link 2014-02-10 1041
227 Omnidirectional and panoramic image dataset We share our omnidirectional and panoramic image dataset (with annotations) to be used for human and car detection. Please reach through: panorama detection car omnidirection human recognition link 2017-01-13 997
107 BIWI Pedestrians We provide the three datasets used for testing our system for our ICCV 2007 publication, including annotations. Data was recorded using a pair of AVT Marlins mo... detection, tracking, pedestrian, trajectory, crowd, overlap, occlusion link 2013-03-12 909
373 DAVIS: Densely Annotated VIdeo Segmentation We present the 2017 DAVIS Challenge, a public competition specifically designed for the task of video object segmentation. Following the footsteps of other succ... object tracking segmentation video benchmark code hd quality resolution link 2017-04-23 9
330 Cityscapes We present a new large-scale dataset that contains a diverse set of stereo video sequences recorded in street scenes from 50 different cities, with high quality... stereo video urban cities semantic segmentation detection car person pedestrian weakly link 2016-07-19 729
299 CAMP-TUM: Multiple Human Pose Estimation from Multiple Views We introduce the Shelf dataset for multiple human pose estimation from multiple views. In addition we annotate the body joints in the Campus dataset from CVLAB@... 3D human pose estimation multiple view motion capture link 2015-07-15 443
199 THUR15000 We introduce a labeled dataset of categorized images for evaluating sketch based image retrieval. Using Flickr, we downloaded about 3000 images for each of the ... group saliency object detection visual attention sketch shape retrieval internet link 2013-10-08 746
109 EITZ Sketch-Based Image Retrieval We introduce a benchmark for evaluating the performance of large scale sketch-based image retrieval systems. The necessary data is acquired in a controlled user... shape, matching, retrieval, partial, sketch link 2014-02-11 552
361 KAIST Multispectral Pedestrian Detection Benchmark We developed imaging hardware consisting of a color camera, a thermal camera and a beam splitter to capture the aligned multispectral (RGB color + Thermal) imag... pedestrian, thermal, RGB link 2015-11-09 127
254 ChokePoint Dataset We collected a video dataset, termed ChokePoint, designed for experiments in person identification/verification under real-world surveillance conditions using e... human pedestrian identification recognition multiview sequence face detection real world surveillance clustering link 2015-05-02 918
328 UT Zappos50K UT Zappos50K (UT-Zap50K) is a large shoe dataset consisting of 50,025 catalog images collected from The images are divided into 4 major categories -... fine-grained, ranking, local learning, pairwise comparison, shoes, attributes link 2016-06-23 238
112 SHREC Unlike the previous SHREC contests, the objective of this SHREC 2012 contest is to evaluate the performance of 3D-mesh segmentation techniques instead of evalua... segmentation, mesh, part, 3d link 2013-07-29 510
333 UBC3V Dataset UBC3V is a synthetic dataset for training and evaluation of single or multiview depth-based pose estimation techniques. The nature of the data is similar to the... depth segmentation pose link 2016-08-18 254
65 Middlebury Stereo Two-frame stereo depth estimation... flow, depth, stereo link 2014-03-06 642
111 Grabcut To evaluate our method we designed a new ground truth database of 50 images. The following zip-files contain: Data, Segmentation, Labelling - Lasso, Labelling -... segmentation, boundingbox, color, optimization, background link 2015-06-19 517
306 Shadow Removal Dataset and Online Benchmark for Variable Scene Categories (University of Bath, Bath) To encourage the open comparison of single image shadow removal in the community, we provide an online benchmark site and a dataset. Our quantitatively verified hig... shadow removal benchmark illumination singleview link 2016-02-11 447
292 Mobile Phone and Webcam Hand Images for Personal Authentication and Identification This work attempts to provide two Hand Images Databases for hand biometrics: one is created using a mobile phone camera of modest quality, which we called mob... mobile webcam hand authentication Identification person biometric shape segmentation link 2015-11-09 454
331 EuRoC MAV Dataset This web page presents visual-inertial datasets collected on-board a Micro Aerial Vehicle (MAV). The datasets contain stereo images, synchronized IMU measuremen... aerial vehicles, indoor, global shutter, slam link 2016-07-18 401
24 UIUC Cars This UIUC Cars dataset by Shivani Agarwal, Aatif Awan and Dan Roth contains images of side views of cars for use in evaluating object detection algorithms. The ... car, sideview, detection, scale, recognition, urban, scale link 2013-10-08 862
311 ASL Datasets Repository This site is dedicated to providing datasets for the Robotics community with the aim to facilitate result evaluations and comparisons. The datasets presented on t... laser 3d urban nature city link 2015-10-28 338
121 Oakland 3D This repository contains labeled 3-D point cloud laser data collected from a moving platform in an urban environment. Data are provided for research purposes. ... reconstruction, sfm, urban, semantic, segmentation, laser link 2014-06-10 789
99 BSDS500 This new dataset is an extension of the BSDS300, where the original 300 images are used for training / validation and 200 fresh images, together with human anno... segmentation, edge, contour, detection link 2013-03-12 706
114 TUD Shapes 1+2 This material is supplementary to Michael Stark, Bernt Schiele. How Good are Local Features for Classes of Geometric Objects. Eleventh IEEE International C... shape object classification tool binary link 2013-08-08 587
304 Ian Dworkin (McMaster University) This is the database of biological images (from the genetics model system, Drosophila melanogaster, a fruit fly) across multiple levels of variation. we have... biology genetic variation fly animal classification link 2016-02-11 354
214 The Webcam Clip Art Dataset This is a subset of the dataset introduced in the SIGGRAPH Asia 2009 paper, Webcam Clip Art: Appearance and Illuminant Transfer from Time-lapse Sequences. As... webcam light illumination camera video static change urban nature time link 2014-02-01 577
30 ICG Graz50 This is a dataset of rectified facade images and semantic labels. The goal of the annotation is to study the layout of the facades. It contains 50 images of... segmentation, semantic, procedural, reconstruction, urban, graz link 2014-01-28 667
371 ICS-FORTH MHAD101 Action Co-segmentation This is a custom generated dataset designed for the task of action co-segmentation in pairs of action sequences. The dataset contains 101 pairs of action se... action cosegmentation, temporal segmentation, motion-capture-data, time-series link 2017-04-05 35
89 Corel Photo Gallery This image database is a part of the "Corel Gallery Magic" (commercial product). It contains 80000 images divided into 800 categories of 100 images. These image... semantic, segmentation, outdoor n/a 2017-01-19 588
251 ETHZ CVL RueMonge 2014 This ETHZ CVL RueMonge 2014 dataset is used for 3D reconstruction and semantic mesh labelling for urban scene understanding. It was first published in [1] and p... semantic segmentation 3d reconstruction architecture paris benchmark source code urban recognition classification outdoor pointcloud mesh link 2014-11-24 1057
200 Landmark 3D This dataset provides a collection of web images and 3D models for research on landmark recognition (especially for methods based on 3D models). We hope it coul... landmark recognition classification retrieval 3d reconstruction codebook matching feature flickr link 2016-08-09 886
271 Labeling in 3D Scenes This dataset package contains the software and data used for Detection-based Object Labeling on the RGB-D Scenes Dataset as implemented in the paper: Detecti... 3d kinect reconstruction indoor depth object recognition link 2015-03-16 572
151 People in WBCN This dataset is for people tracking in wide baseline camera networks and was designed as a contest at ICPR 2012. The contest consists of two challenges: ... detection, tracking, pedestrian, trajectory, crowd, overlap, occlusion, aerial link 2013-08-02 1048
349 HKUST Ambiguity Dataset This dataset contains two image collections, TempleOfHeaven and SportsArena, that are deemed hard for Structure-from-Motion (SfM). The method is described i... Structure-from-Motion, Ambiguous structures link 2016-11-07 141
252 Volleyball Activity Dataset 2014 This dataset contains 7 challenging volleyball activity classes annotated in 6 videos from professionals in the Austrian Volley League (season 2011/12). A total... action activity sport volleyball detection recognition video analysis link 2014-10-23 1128
71 University of Tsukuba Stereo Flow This dataset contains 1800 stereo pairs with ground truth disparity maps, occlusion maps and discontinuity maps that will help to further develop the state of t... flow, depth, stereo, graphics, synthetic, optical, link 2013-08-08 1111
256 Multi-Task Facial Landmark (MTFL) dataset This dataset contains 12,995 face images which are annotated with (1) five facial landmarks, (2) attributes of gender, smiling, wearing glasses, and head pose. ... face, landmark detection, deep learning, cnn, attribute link 2015-11-07 1088
367 NUS Multi-Sensor Presentation (NUSMSP) Dataset This dataset consists of 51 oral presentations recorded with 2 ambient visual sensors (web-cam), 3 First Person View (FPV) cameras (1 on presenter and 2 on randomly c... multi-sensor; presentation analysis link 2017-03-22 37
338 MIT LaMem: Large-Scale Image Memorability Dataset This database contains 60,000 images with memorability scores. The images come from a variety of datasets including SUN, COCO, image popularity, AVA, and severa... memorability aesthetics objects scenes popularity link 2016-11-24 232
312 University of León - Edge profile milling head tool data set This data set comprises 144 images of an edge profile cutting head of a milling machine. The head tool contains a total of 30 cutting inserts. The cutting head ... milling head tool inserts localization object cutting tools edge profile tool wear monitoring link 2015-11-06 285
92 ICDAR 2011 This challenge is set up around three tasks: Text Localisation, Text Segmentation and Word Recognition. Participation in any or all tasks is welcome. Check the ... text, detection, recognition, classification link 2016-06-01 630
105 MSR 3D Video These sequences were used for our video interpolation work described in High-quality video view interpolation using a layered representation, C.L. Zitnick, ... reconstruction, camera, segmentation, depth link 2013-03-12 707
43 ZuBud The Zurich Building dataset (ZuBud) from Hao Shao, Tomas Svoboda and Luc Van Gool [?] contains 1005 images with 201 buildings each in five views. There is also ... retrieval, urban, procedural, rectification link 2013-03-11 654
217 Youtube-Objects dataset The YouTube-Objects dataset is composed of videos collected from YouTube by querying for the names of 10 object classes. It contains between 9 and 24 videos for... video object detection segmentation flow optical link 2014-02-03 808
195 Yotta The Yotta dataset consists of 70 images for semantic labeling given in 11 classes. It also contains multiple videos and camera matrices for 14 km of driving. ... semantic segmentation urban video camera 3d reconstruction classification link 2013-09-30 721
117 YorkUrbanDB The York Urban Line Segment Database is a compilation of 102 images (45 indoor, 57 outdoor) of urban environments consisting mostly of scenes from the campus of... vanishing, point, pose, urban, reconstruction, outdoor, geometry, manhattan link 2013-09-18 569
29 The Yale Face The Yale Face dataset from A. Georghiades contains 5760 single light source images of ten subjects, each shown in 9 poses and 64 illumination setups (leading to... face, pedestrian, detection, pose, illumination link 2015-06-23 609
344 YACCLAB dataset The YACCLAB dataset includes both synthetic and real binary images and is suitable for a wide range of applications, ranging from document processing to survail... Labeling Binary Text Medical Fingerprints VideoSurveillance Natural RandomNoise link 2017-01-20 190
300 CMP WxBS dataset The Wide (multiple) Baseline Dataset. 31 image pairs, simultaneously combining several nuisance factors: geometry, illumination, IR-visible, etc. WxBS: Wide ... feature detection description matching viewpoint IR day night link 2015-07-15 531
279 WWW Crowd The Where Who Why (WWW) dataset provides 10,000 videos with over 8 million frames from 8,257 diverse scenes, therefore offering a superior comprehensive dataset... surveillance crowd pedestrian detection recognition flow optical video link 2015-05-27 760
290 UWO GCO Volume Segmentation The Western GCO Segmentation problem instances are provided to compare effects of graph size, neighborhood size, length of s to t paths, regional arc consistenc... medical liver babyface bone abdomen adhead face segmentation binary optimization link 2015-06-19 375
40 Weizmann Action The Weizmann actions dataset by Blank, Gorelick, Shechtman, Irani, and Basri consists of ten different types of actions: bending, jumping jack, jumping, jump in... video, segmentation, action, classification link 2015-07-14 603
321 Webcam Interestingness The Webcam Interestingness dataset consists of 20 different webcam streams, with 159 images each. It is annotated with interestingness ground truth, acquired in... webcam interest classification retrieval ranking video weather link 2016-03-02 340
215 WILD -Weather and Illumination Database The Weather and Illumination Database (WILD) is an extensive database of high quality images of an outdoor urban scene, acquired every hour over all seasons. It... webcam light illumination camera video static change urban time depth estimation weather newyork link 2016-04-19 948
26 We Are Family Stickmen The We Are Family Stickmen dataset from Eichner and Ferrari contains X images with X people in group photos for human pose estimation with annotated 2D human bo... pose, pedestrian, body part link 2013-03-11 658
178 VSUMM (Video SUMMarization) The VSUMM (Video SUMMarization) dataset is of 50 videos from Open Video. All videos are in MPEG-1 format (30 fps, 352 x 240 pixels), in color and with sound. Th... video summary type user study keyframe static similarity link 2015-11-13 844
372 VOT2016 segmentation The VOT2016 pixel-wise annotations dataset contains pixel-wise per-frame annotations for sequences from VOT2016 dataset. The annotation is in a form of BW image... object tracking segmentation mask annotation visual link 2017-04-17 10
258 Visual Attributes dataset The Visual Attributes dataset contains visual attribute annotations for over 500 object classes (animate and inanimate) which are all represented in ImageNet. E... classification recognition attribute imagenet object link 2016-10-02 687
218 VidPairs The VidPairs dataset contains 133 pairs of images, taken from 1080p HD (~2 megapixel) official movie trailers. Each pair consists of images of the same scene wi... video pair matching patch description flow dense optical link 2015-06-19 579
363 ETH/Yahoo Video2Gif dataset The Video2GIF dataset contains over 100,000 pairs of GIFs and their source videos. The GIFs were collected from two popular GIF websites (, gifsoup.... video2gif highlight video summarization gif summary scene understanding link 2017-02-22 76
245 ETHZ CVL Video SumMe The Video Summarization (SumMe) dataset consists of 25 videos, each annotated with at least 15 human summaries (390 in total). The data consists of videos, anno... video summary benchmark human groundtruth action event link 2016-10-21 993
296 Video Segmentation Benchmark The Video Segmentation Benchmark (VSB100) provides ground truth annotations for the Berkeley Video Dataset, which consists of 100 HD quality videos divided into... video segmentation benchmark object tracking pedestrian groundtruth motion link 2017-03-21 778
288 Berkeley Urban Street tracking The UrbanStreet dataset used in the paper can be downloaded here [188M] . It contains 18 stereo sequences of pedestrians taken from a stereo rig mounted on a ca... tracking detection segmentation multitarget recognition video pedestrian urban human link 2015-07-14 887
323 UT Egocentric (UT Ego) Dataset The Univ. of Texas at Austin Egocentric (UT Ego) Dataset contains 4 videos captured from head-mounted cameras. Each video is about 3-5 hours long, captured in ... First-person vision, egocentric link 2016-03-17 295
234 UMD Dynamic Scene Recognition The UMD Dynamic Scene Recognition dataset consists of 13 classes and 10 videos per class and is used to classify dynamic scenes. The dataset has been describ... scene recognition classification dynamic video motion link 2017-01-05 770
44 UK Bench The UK Bench dataset from Henrik Stewenius and David Nister contains 10200 images of N=2550 groups with each four images at size 640x480. The images are rotated... retrieval image object centered rotation link 2016-07-22 1258
204 UCF Person and Car VideoSeg The UCF Person and Car VideoSeg dataset consists of six videos with groundtruth for video object segmentation. Surfing, jumping, skiing, sliding, big car, sm... video segmentation object motion model camera groundtruth link 2015-04-19 841
274 UBO 2014 Materials The UBO 2014 consists of 7 semantic categories. Each of these 7 material categories contains measurements of 12 different material instances for being capable t... material light illumination texture classification recognition link 2015-03-27 423
276 TST TUG (Timed Up and Go) The TUG (Timed Up and Go test) dataset consists of actions performed three times by 20 volunteers. The people involved in the test are aged between 22 and 39, w... action recognition time kinect wearable accelerometer human video link 2015-05-02 413
12 TUD Pedestrians training The TUD Pedestrians training dataset from Micha Andriluka, Stefan Roth and Bernt Schiele consists of 210 and 400 training images with X pedestrians with signifi... segmentation, pedestrian, sideview link 2013-03-11 1132
10 TUD Pedestrians The TUD Pedestrians dataset from Micha Andriluka, Stefan Roth and Bernt Schiele [AndrilukaCVPR2008] consists of 250 images with 311 fully visible people with si... segmentation, pedestrian, sideview link 2015-05-26 1268
17 TUD Motorbike The TUD Motorbike dataset from Bastian Leibe contains 115 images collected from the internet. Each image contains one or more motorbikes at different scales and... motorbike, detection, pascal link 2013-08-08 789
9 TUD Crossing tracking The TUD Crossing dataset from Micha Andriluka, Stefan Roth and Bernt Schiele consists of 201 images with 1008 highly overlapping pedestrians with significant va... tracking detection segmentation multitarget pedestrian sideview overlap urban link 2015-06-19 1426
11 TUD Campus The TUD Campus dataset from Micha Andriluka, Stefan Roth and Bernt Schiele consists of 71 images and 303 highly overlapping pedestrians with large scale changes... segmentation, pedestrian, sideview, overlap link 2013-03-11 974
347 MOCAT (TUB Multi-Object and Multi-Camera Tracking Dataset) The TU Berlin Multi-Object and Multi-Camera Tracking Dataset (MOCAT) is a synthetic dataset to train and test tracking and detection systems in a virtual world.... synthetic tracking detection multi-class multi-view evaluation pedestrian vehicle animal link 2016-11-02 254
210 Traffic Video dataset The Traffic Video dataset consists of X video of an overhead camera showing a street crossing with multiple traffic scenarios. The dataset can be downloaded ... urban traffic tracking detection overhead view road video link 2014-02-03 2217
284 TRANCOS Overlapping Car Crowds The TRaffic ANd COngestionS (TRANCOS) dataset, a novel benchmark for (extremely overlapping) vehicle counting in traffic congestion situations. It consists of 1... object detection car transportation vehicle highway urban spain traffic link 2015-06-16 754
8 Tools2D The Tools 2D dataset from Bronstein, Bronstein, Bruckstein, and Kimmel [?] for partial similarity experiments and consists of 15 shapes: 5 humans, 5 horses and ... shape, binary, matching, retrieval, partial link 2014-02-11 851
102 Tiny Images The Tiny Images dataset consists of 79,302,017 images, each being a 32x32 color image. This data is stored in the form of large binary files which can be accese... classification, tiny, color, retrieval link 2013-03-12 581
198 THUS10000 The THUS10000 benchmark dataset comprises 10,000 images, each of which has an unambiguous salient object and the object region is accurately annotated with p... segmentation saliency object detection visual attention link 2015-01-11 879
177 SIPI textures The Textures volume currently contains 154 images, all monochrome, 129 512x512 and 25 1024x1024. For the Brodatz texture images, the number in parenthesis (i... texture, segmentation, classification, benchmark, synthetic, evaluation link 2013-08-20 722
167 Text and Vision (TVGraz) Dataset The Text and Vision (TVGraz) dataset is an annotated multi-modal dataset which currently contains 10 visual object categories, 4030 images and associated text. ... text appearance classification evaluation link 2017-01-10 818
185 Kung-Fu fighter Multi-View The test sequences provide interested researchers a real-world multi-view test data set captured in the blue-c portals. The data is meant to be used for testing... multiview tracking segmentation camera action link 2013-10-08 713
137 Synthetic CAD models The Synthetic CAD Models dataset consists of X synthetic CAD models for detection (planar) primitives. Efficient RANSAC for Point-Cloud Shape Detection Ruwe... model, ransac, 3d object, reconstruction, primitive, synthetic link 2013-08-08 645
209 Symmetry Set The Symmetry set dataset is a collection of images at different illuminations for the purpose of image matching using local symmetry features. Image Matching... symmetry matching feature image illumination lighting urban building link 2013-11-05 703
186 Symmetry Facades The Symmetry Facades dataset contains 9 building facades with multiple images. It is used for coupled symmetry and structure from motion detection. Coupled Str... symmetry facade building urban reconstruction sfm 3d repetition link 2013-09-05 887
122 Symmetric Bundle Adjustment The Symmetric Bundle Adjustment dataset contains four sequences of the CAB building, Barcelona, Redmond and Capitole for 3D reconstruction considering symmetrie... reconstruction, sfm, urban, bundle adjustment, symmetry link 2013-03-12 700
93 Street View Text The Street View Text (SVT) dataset contains 647 words and 3796 letters in 249 images harvested from Google Street View. The dataset is more challenging becaus... text, detection, recognition, classification, outdoor, urban link 2014-01-13 830
242 Stanford Dogs Dataset The Stanford Dogs dataset contains images of 120 breeds of dogs from around the world. This dataset has been built using images and annotation from ImageNet for... classification, detection, fine-grained categorization, dogs link 2015-07-29 1053
197 Stanford Background Dataset The Stanford Background Dataset is a new dataset introduced in Gould et al. (ICCV 2009) for evaluating methods for geometric and semantic scene understanding. T... semantic segmentation urban classification nature geometry link 2016-01-21 1258
272 Stanford 40 Actions The Stanford 40 Actions dataset contains images of humans performing 40 actions. In each image, we provide a bounding box of the person who is performing the ac... human action recognition detection boundingbox link 2015-06-19 671
128 The Stanford 3D Scanning Repository The Stanford 3D Scanning Repository dataset is a compilation of 3D scans of objects like Stanford Bunny, Happy Buddha, Dragon, Armadillo and Lucy. These contain... reconstruction, laser, bunny, triangulation link 2013-03-21 907
127 Stable Structure from Motion Due to size limitations, the images of the Stable Structure from Motion datasets cannot be put online. Instead here are the tracked image points and the final reconstr... sfm, reconstruction, geometry, stability, robust, 3d, landmark, church link 2013-08-08 1003
305 SPHERE human skeleton movements The SPHERE human skeleton movements dataset was created using a Kinect camera, that measures distances and provides a depth map of the scene instead of the clas... human action behavior motion movement video skeleton depth kinect link 2016-03-24 526
100 Sowerby The Sowerby dataset contains 105 images for semantic segmentation.... semantic, segmentation, outdoor n/a 2014-09-26 715
6 SIID The SIID silhouette dataset contains... and is from the Shape Indexing of Image Database (SIID). Download SIID silhouette dataset shape, binary, matching, retrieval link 2017-03-02 908
170 Sheffield Kinect Gesture (SKIG) dataset The Sheffield Kinect Gesture (SKIG) dataset contains 2160 hand gesture sequences (1080 RGB sequences and 1080 depth sequences) collected from 6 subjects. ... gesture, kinect, recognition, human, action, illumination, depth link 2013-08-09 734
202 GaTech SegTrack The SegTrack dataset consists of six videos (five are used) with ground truth pixelwise segmentation (6th penguin is not usable). The dataset is used for accura... video segmentation object proposal flow optical motion model camera stationary groundtruth link 2013-10-09 661
320 San Francisco Landmark Dataset for Mobile Landmark Recognition The San Francisco Landmark Dataset for Mobile Landmark Recognition is a set of images and query images for localization. We present the San Francisco Landmar... retrieval localization city urban sanfrancisco landmark calibration gps mobile link 2016-03-04 420
120 Samantha The SAMANTHA (Structure-and-Motion Pipeline on a Hierarchical Cluster Tree) dataset contains 4 sequences for 3D reconstruction: Pozzoveggiani, Piazza Dante, Pia... reconstruction, sfm, landmark, model, geometry link 2013-03-12 964
265 Salient Montages: Human-centric Video Summarization The Salient Montages is a human-centric video summarization dataset from the paper [1]. In [1], we present a novel method to generate salient montages from u... video summarization montage saliency wearable human link 2015-05-02 473
255 Robotic 3D Scan Repository The Robotic 3D Scan Repository from Osnabrueck contains 23 different datasets showing a variety of 3D scans for objects, humans, cities, university campus, heat... 3d reconstruction scan laser heat urban city human aerial germany bremen lidar osnabrueck link 2015-04-10 586
140 RGB-D Person Re-identification The RGB-D Person Re-identification dataset is for person re-identification using depth information. The main motivation is that the standard techniques (such as... identification, classification, shape, depth, pedestrian, 3d link 2014-10-08 840
295 Rent3D The Rent3D dataset comprises floorplans and images. The goal of this work is to enable a 3D virtual-tour of an apartment given a small set of monocular images o... indoor building reconstruction layout floorplan apartment urban link 2015-07-13 412
135 Quad 6K The Quad 6K dataset is a Structure-from-Motion dataset taken at Arts Quad at Cornell University campus and consists of 6514 images with ground truth positions o... reconstruction, sfm, urban, groundtruth, landmark, 3d gps link 2013-11-05 862
169 QMUL Junction Dataset The QMUL Junction dataset is a busy traffic scenario for research on activity analysis and behavior understanding. Video length: 1 hour (90000 frames) Fra... detection tracking crowd counting pedestrian video motion behavior link 2016-12-06 1067
60 PSU HUB The PSU HUB dataset is a detection, tracking dataset. Ground truth trajectory and grouping information for pedestrians walking in the PSU student union building... detection, tracking, pedestrian, trajectory, crowd, overlap, occlusion link 2013-07-19 797
336 Procedural texture perceptual similarity The procedural texture perceptual similarity dataset contains a list of procedural textures along with their pairwise distances, as defined by a perceptual stud... texture procedural benchmark study link 2016-09-21 121
55 Prague Texture Segmentation The Prague Texture Segmentation Datagenerator and Benchmark is designed to mutually compare and rank different (dynamic/static) texture segmenters (supervised o... texture, segmentation, classification, benchmark, synthetic link 2013-08-08 560
370 Pornography Dataset (NPDI/DCC/UFMG) The Pornography database contains nearly 80 hours of 400 pornographic and 400 non-pornographic videos. For the pornographic class, we have browsed websites whic... pornography, videos, video shots, video frames link 2017-03-31 82
212 Polo Instance Segmentation The Polo instance segmentation dataset is a semantic segmentation task for Hough transform based segmentation masks. It consists of supervised segmentation for ... semantic segmentation horse human outdoor mask scene understanding n/a 2016-01-21 651
174 Pittsburgh Fast-food Image dataset The Pittsburgh Fast-food Image dataset (PFID) consists of 4545 still images, 606 stereo pairs, 303 360° videos for structure from motion, and 27 privacy-preservi... food recognition classification reconstruction video laboratory real link 2016-12-21 1321
327 PIROPO Database: People in Indoor ROoms with Perspective and Omnidirectional cameras The PIROPO database (People in Indoor ROoms with Perspective and Omnidirectional cameras) comprises multiple sequences recorded in two different indoor rooms, u... people surveillance perspective omnidirectional fisheye indoor room detection human link 2017-02-16 448
133 Aesthetic The dataset contains 20,278 images with properties related to aesthetics, emotion, and image quality. The only differences are that it is a larger dataset, an... aesthetics, quality, emotion link 2015-05-01 1031
16 PETS 2009 The PETS 2009 dataset contains 3 parts showing multi-view sequences containing pedestrians walking in an outdoor environment. The parts are used for person coun... frontview, outdoor, pedestrian, detection, tracking, overlap, occlusion multitarget, human link 2015-06-19 1128
15 PETS 2006 The PETS 2006 dataset contains 7 parts showing multi-sensor sequences containing left-luggage scenarios with increasing scene complexity at a train station scen... frontview, indoor, pedestrian, detection, tracking, multitarget link 2015-08-12 887
162 ICG PRID 2011 The Person Re-ID (PRID) 2011 dataset was created in co-operation with the Austrian Institute of Technology for the purpose of testing person re-identification a... pedestrian classification identification multiview trajectory illumination appearance change graz link 2013-10-08 809
244 Pedestrian Parsing on Surveillance Scenes (PPSS) dataset The Pedestrian Parsing dataset contains 3,673 images from 171 videos of different Surveillance Scenes (PPSS), where 2,064 images are occluded and 1,609 are not.... Pedestrian, Parsing, Segmentation link 2017-03-21 1278
247 PASCAL VOC Parts The PASCAL VOC is augmented with segmentation annotation for semantic parts of objects. For example, for the person category, we provide segmentation mask for 2... detection recognition pascal object part pedestrian human segmentation semantic link 2014-09-30 993
25 PASCAL VOCs The PASCAL VOC Challenge dataset by Mark Everingham is a yearly dataset which has a central evaluation server and the final test data is not released. The late... detection segmentation pose pedestrian chair animal car building airplane link 2017-03-09 820
63 Paris500k The Paris500k dataset consists of 501,356 geotagged images collected from Flickr and Panoramio. The dataset was collected from a geographic bounding box rather ... retrieval, paris, landmark, geotag, flickr, panoramio, sfm, reconstruction link 2016-12-23 830
46 Paris Retrieval The Paris dataset consists of 6412 images. Images have high resolution and are in JPEG format. retrieval, urban, paris, landmark link 2016-10-11 614
266 Paris Art Deco Facades The Paris Art Deco Facades dataset consists of 79 / 80 images of rectified facades of the architectural style Art Deco, which has different sizes of windows, de... paris semantic segmentation recognition architecture facade urban city procedural grammar link 2015-01-20 500
356 The Oxford RobotCar Dataset The Oxford RobotCar Dataset contains over 100 repetitions of a consistent route through Oxford, UK, captured over a period of over a year. The dataset captures ... car robot driving autonomous street urban video recognition detection classification segmentation time year link 2017-01-04 201
45 Oxford Buildings The Oxford Buildings dataset by James Philbin and Andrew Zisserman consists of 5062 images collected from Flickr by searching for particular Oxford landmarks. T... retrieval, urban, oxford, landmark link 2017-04-17 670
175 Outex texture bench The Outex dataset is part of a framework for empirical evaluation of texture classification and segmentation algorithms. The framework is being constructed acc... texture, segmentation, classification, benchmark, synthetic link 2015-11-17 567
192 Our Database of Faces The Our Database of Faces (ORL) dataset contains ten different images of each of 40 distinct subjects. For some subjects, the images were taken at different tim... face recognition illumination human expression link 2013-09-23 739
66 Middlebury MVS Temple The object is a plaster reproduction of Temple of the Dioskouroi in Agrigento, Sicily. Click on thumbnail for a full-sized (640x480) image. Resolution of ground... sfm, reconstruction, benchmark, multiview, 3d, link 2013-09-20 639
67 Middlebury MVS Dino The object is a plaster dinosaur (stegosaurus). Click on thumbnail for a full-sized (640x480) image. Resolution of ground truth model: 0.00025m (you may wish to... sfm, reconstruction, benchmark, multiview, 3d, link 2013-09-20 748
149 NYU Depth v2 The NYU-Depth V2 data set is comprised of video sequences from a variety of indoor scenes as recorded by both the RGB and Depth cameras from the Microsoft Kinec... semantic segmentation depth kinect label reconstruction link 2013-07-25 1140
148 NYU Depth v1 The NYU-Depth data set is comprised of video sequences from a variety of indoor scenes as recorded by both the RGB and Depth cameras from the Microsoft Kinect. ... semantic segmentation depth kinect label reconstruction link 2014-10-05 745
54 Notre Dame The Notre Dame de Paris dataset is used for 3D SfM reconstruction and contains 715 images provided by Noah Snavely. There are also versions for NotreDame by Mic... limited, flickr, landmark, sfm, paris, frontview, reconstruction, 3d, pointcloud link 2015-06-19 759
196 New College Data The New College Data Set contains 30GB of data intended for use by the mobile robotics and vision research communities. Our anticipated users are parties intere... odometry urban path 3d reconstruction panorama stereo navigation link 2013-09-30 686
59 Near-Regular Textures The Near-Regular Textures dataset contains textures from completely regular to completely irregular patterns, with a focus on near-regular textures. It also inc... texture, segmentation, classification, symmetry, regular, stochastic link 2013-03-11 593
129 NBVbench The NBVbench is a reference object and benchmark criteria for defining and evaluating the performance of a next best view (NBV) method. ... reconstruction, view, planning, geometry link 2013-04-16 516
7 Mythological Creatures The Mythological Creatures consists of articulated shapes (silhouettes) for partial similarity experiments and contains 15 shapes: 5 humans, 5 horses and 5 cent... shape, binary, matching, retrieval, partial, animal link 2015-06-23 884
173 MuHAVi and MAS human action The Multicamera Human Action Video Data (MuHAVi) and Manually Annotated Silhouette Data (MAS) are two datasets consisting of selected action sequences for the eval... human action behavior segmentation video background link 2017-04-04 1194
57 Weizmann Horses The multi-scale Weizmann horses (originally from Eran Borenstein, adapted by Jamie Shotton) consists of 656 images which are split into 50+50 training, 50+50 vali... detection, shape, segmentation, clutter, nature, horse link 2013-03-11 863
355 IMPART multi-modal/multi-view The multi-modal/multi-view datasets are created in a cooperation between University of Surrey and Double Negative within the EU FP7 IMPART project. The sourc... multi-view multi-mode video rgbd lidar 3d model color indoor outdoor dynamic action face human emotion link 2017-01-01 133
268 HUJI Multi-illuminant Image Sequences dataset The Multi-illuminant Image Sequences dataset contains 16 video sequences (13 with single light source and 3 with two global light sources), recorded with an HD ... illumination nature physics dichromatic light chromaticity color constancy white balance object link 2015-02-20 472
332 Multi-FoV - Large Field-of-View Cameras for Visual Odometry The Multi-FoV synthetic datasets are two synthetic scenes (vehicle moving in a city, and flying robot hovering in a confined room). For each scene, three differ... visual odometry camera fov synthetic groundtruth blender link 2016-08-11 262
37 MSRC vNIPS The MSRC vNIPS dataset is the MSRC v2 dataset with new annotations for much more accurate segmentations for 93 images. Efficient Inference in Fully Connected... segmentation, semantic, outdoor link 2013-03-11 559
36 MSRC v2 The MSRC v2 dataset is an extension of the MSRC v1 dataset from Microsoft Research in Cambridge. It contains 591 images and 23 object classes with accurate pixe... segmentation, semantic, outdoor link 2016-08-28 1545
35 MSRC v1 The MSRC v1 dataset from Microsoft Research in Cambridge contains 240 images and 9 object classes with coarse pixel-wise labeled images. The dataset is commonl... segmentation, semantic, outdoor link 2016-09-07 1186
183 MSR RGB-D 7-Scenes The MSR RGB-D Dataset 7-Scenes dataset is a collection of tracked RGB-D camera frames. The dataset may be used for evaluation of methods for different applicati... depth video kinect tracking location reconstruction link 2013-09-05 748
184 MSR Learning to Rank The MSR Learning to Rank datasets are two large-scale datasets for research on learning to rank: MSLR-WEB30k with more than 30,000 queries and a random sampling of it MS... rank learning sampling search link 2013-09-05 547
182 MSR Action The MSR Action dataset is a collection of various 3D datasets for action recognition. See details video action recognition detection reconstruction 3d link 2013-09-05 726
70 MPI Sintel Flow The MPI Sintel Flow dataset is an optical flow / stereo dataset based on the Blender movie Sintel: The goal of this project was to crea... flow, depth, stereo, graphics, synthetic link 2013-08-08 654
259 MOT Challenge 2D and 3D The MOT Challenge is a framework for the fair evaluation of multiple people tracking algorithms. In this framework we provide: - A large collection of datase... 3d tracking multiple target benchmark dataset people pedestrian surveillance video link 2015-07-31 972
144 MNIST hand-written letters The MNIST database of handwritten digits, available from this page, has a training set of 60,000 examples, and a test set of 10,000 examples. It is a subset of ... text, classification, letter link 2017-02-06 1403
345 MMSE Heartrate The MMSE heart rate dataset measures the visual heart rate from faces by throwing darts at people. ... face landmark emotion heart rate biology n/a 2016-10-21 187
28 CMU Faces - Frontal faces The MIT + CMU frontal face dataset from H. Rowley contains 130 images with 507 labeled frontal faces from movie, portrait and media sources. It is mostly graysc... frontview, face, detection object boundingbox link 2015-06-19 662
317 NYU Symmetry Database The mirror symmetry database contains 176 single-symmetry and 63 multiple-symmetry images (.png files) with accompanying ground-truth annotations (.mat files). ... symmetry detection mirror groundtruth link 2016-04-15 290
153 MSRC Kinect Gesture Dataset The Microsoft Research Cambridge-12 Kinect gesture dataset consists of sequences of human movements, represented as body-part locations, and the associated gest... gesture, kinect, recognition, human, action link 2013-08-08 788
240 Microsoft COCO The Microsoft COCO (mscoco) is an image recognition and segmentation dataset which contains more than 300k images for more than 70 categories. Other features: Mo... object context segmentation detection recognition benchmark semantic link 2015-05-02 1179
168 Mall Dataset The Mall dataset was collected from a publicly accessible webcam for crowd counting and profiling research. Ground truth: Over 60,000 pedestrians were label... detection tracking crowd counting pedestrian indoor video webcam link 2016-12-06 1276
241 Malaya Abrupt Motion (MAMo) Dataset The Malaya Abrupt Motion (MAMo) dataset is targeted for visual tracking, particularly for abrupt motion tracking. It was collected from publicly accessible data... visual tracking, abrupt motion tracking link 2016-11-05 831
104 Make3D Depth The Make3D Depth dataset is designed to learn features to estimate scene depth from a single image. This dataset contains aligned image and range data: Make3... depth, learning, single view, outdoor, indoor link 2013-03-12 998
163 TUGRAZ ICG Longterm Pedestrian Dataset The Longterm Pedestrian dataset consists of images from a stationary camera running 24 hours for 7 days at about 1 fps. It is used for adaptive detection and back... pedestrian change detection background illumination robust indoor coffee graz multitarget link 2015-06-19 795
39 Leuven Stereo Scene The Leuven Stereo Scene dataset is a scene and depth dataset. There exist two variants of this dataset - a CVPR 2007 paper [1] by Leibe et al. for detection and... segmentation, semantic, reconstruction, urban, sfm, 3d, leuven, depth, stereo link 2013-11-03 1411
18 Leeds Cows The Leeds Cows dataset by Derek Magee consists of 14 different video sequences showing a total of 18 cows walking from right to left in front of different backg... detection segmentation cow video background animal link 2013-08-08 701
85 Leaves The Leaves dataset from X contains X images of leaves. Leaves dataset taken by Markus Weber. California Institute of Technology PhD student under Pietro Per... shape, binary, matching, retrieval, partial n/a 2015-12-25 700
208 Landmark 1000 The Landmark 1000 or 1k dataset is a collection of the top 1000 popular flickr landmarks mined from flickr. It is maintained by Noah Snavely and published in... landmark 3d reconstruction pose estimation pointcloud world location link 2013-11-05 781
97 KU Leuven Facade The KU Leuven Facade dataset is used for architectural styles classification. M. Mathias, A. Martinovic, J. Weissenberg, S. Haegler, L. Van Gool: Automatic A... classification, architecture, urban link 2014-11-20 652
188 KTH Multiview Football The KTH Multiview Football dataset contains 771 images of football players, taken from 3 views at 257 time instances, with 14 annotated body joints. ... multiview pedestrian tracking detection object camera outdoor game soccer pose recognition multitarget link 2016-09-18 1062
4 KIMIA99 The Kimia 99 has 9 classes each consisting of 11 images. They are part of the Shape Indexing of Image Database (SIID) project, which also contains the SIID... shape, binary, matching, retrieval, kimia link 2015-07-29 919
3 KIMIA25 The Kimia 25 consists of 6 classes and 25 images. They are part of the Shape Indexing of Image Database (SIID) project, which also contains the SIID silhouette ... shape binary matching retrieval kimia link 2015-08-26 729
5 KIMIA216 The Kimia 216 has 18 classes each consisting of 12 images. It contains shape silhouettes for birds, bones, brick, camels, car, children, classic cards, elephan... shape, binary, matching, retrieval, kimia, animal link 2016-11-09 1204
322 Kendall Square Webcam The Kendall Square webcam dataset consists of two streams for one sunny day and one cloudy day of a city square. It is used for tracking and analyzing color cha... webcam color weather change detection appearance sky link 2016-03-02 426
38 IcgBench The Interactive Segmentation (IcgBench) dataset from Jakob Santner contains 243 images and 262 segmentations. Some images have multiple segmentations. The annota... interactive, segmentation, user link 2013-03-11 518
14 INRIA People The INRIA People dataset from Navneet Dalal and Bill Triggs [DalalCVPR2005] consists of training and testing data. The training contains 1805 images and X peopl... detection, pedestrian, sideview, frontview, human, boundingbox link 2015-06-19 1047
58 INRIA Horses The INRIA Horses dataset from Frederic Jurie and Vittorio Ferrari consists of 170 images with one or more horses in side-view at several scales and cluttered ba... detection, shape, segmentation, clutter, nature, horse link 2013-03-11 560
232 Pratheepan Human Skin Detection Dataset The images in this dataset are downloaded randomly from Google for human skin detection research. It has been used in the paper: W.R. Tan, C.S. Chan, Y. Prathee... skin detection, skin segmentation, human detection, skin dataset link 2016-11-05 1641
21 ImageNET The ImageNET dataset is the latest dataset by Li Fei-Fei containing various datasets ranging from 1000 to 10000 categories.... retrieval, segmentation, classification link 2013-03-11 720
108 ICG Sketch Retrieval The ICG Sketch Retrieval dataset consists of XXX hand-drawn sketches for five categories. It is used for content-based image retrieval using shape features for ... shape, matching, retrieval, partial, sketch n/a 2014-02-11 570
166 ICG Multi-Camera Datasets The ICG Multi-Camera datasets consist of Easy Data Set (just one person) Medium Data Set (3-5 persons, used for the experiments) Hard Data Set (crowded sc... multiview pedestrian tracking detection object camera calibration graz indoor video multitarget link 2015-06-19 973
165 ICG Multi-Camera and Virtual PTZ The ICG Multi-Camera and Virtual PTZ dataset contains the video streams and calibrations of several static Axis P1347 cameras and one panoramic video from a sph... multiview pedestrian tracking detection object camera calibration graz network video panorama crowd outdoor multitarget link 2015-06-19 1014
164 ICG Lab 6 (Multi-Camera Multi-Object Tracking) The ICG Lab 6 (Multi-Camera Multi-Object Tracking) dataset contains 6 indoor people tracking scenarios recorded at our laboratory using 4 static Axis P1347 came... multiview pedestrian tracking detection object laboratory camera calibration evaluation segmentation graz link 2013-10-08 1296
86 ICG Graz240 The ICG Graz240 dataset consists of 240 buildings with 5400 redundant images with a total of 5542 window instances. Window detection itself is difficult due to ... segmentation, detection, semantic, urban, graz link 2016-03-29 724
91 ICDAR 2003 The ICDAR 2003 datasets available for download on this site: Robust Reading, Robust Word Recognition, Robust OCR, Text Locating and Cursive Script. Pleas... text, detection, recognition, classification link 2013-03-12 741
80 Hopkins 155 The Hopkins 155 Dataset has been created with the goal of providing an extensive benchmark for testing feature based motion segmentation algorithms. It contains... flow, stereo, motion, segmentation, urban link 2015-04-01 818
286 HDA Person Dataset - ISR Lisbon The High Definition Analytics (HDA) dataset is a multi-camera High-Resolution image sequence dataset for research on High-Definition surveillance: Pedestrian De... Video Surveillance Pedestrian Detection Re-Identification Multiview Tracking Benchmark Indoor High-Definition Camera Network lisbon human link 2017-01-26 1216
118 Hermann Maier Nagano 1998 The Hermann Maier Nagano 1998 dataset is used for an extremely difficult deformable tracking scenario. There are 4 parts to this sequence a) 00088 - 00554 Maie... tracking, single, trajectory, clutter, deformation n/a 2013-03-12 551
194 HCI 4D Lightfields The HCI 4D Lightfields dataset contains 11 objects with corresponding lightfields for depth estimation. Datasets can be downloaded individually below. For ma... 3d 4d lightfield benchmark depth reconstruction evaluation link 2017-03-01 861
307 HandNet annotated hand dataset The HandNet dataset contains depth images of 10 participants' hands non-rigidly deforming in front of a RealSense RGB-D camera. This dataset includes 214971 a... hands articulated segmentation classification detection pose fingertip link 2015-09-07 579
23 Graz02 The Graz02 dataset by Andreas Opelt and Axel Pinz contains four categories of images: bikes, people, cars and a single background class. The annotation has been... bike, pedestrian, background, detection, clutter, car, graz link 2014-04-24 863
22 Graz01 The Graz01 dataset by Andreas Opelt and Axel Pinz contains four types of images: bikes, people, background with no bikes, background with no people.... bike, pedestrian, background, detection, clutter, graz, occlusion link 2013-08-08 836
52 Graffiti The Graffiti dataset by Krystian Mikolajczyk and Cordelia Schmid contains 48 images split into 8 sequences with 6 images each showing different structured and t... feature, detection, description, rectification, benchmark link 2017-02-23 639
125 Google Street View Pittsburgh Research The Google Street View Pittsburgh Research dataset is a street-level image collection provided by Google for research purposes. The dataset provided here co... 3d, reconstruction, sfm, urban, pittsburgh, panorama n/a 2016-07-25 1633
293 Google Street View Localization The Google Street View dataset contains 62,058 high quality Google Street View images. The images cover the downtown and neighboring areas of Pittsburgh, PA; Or... localization retrieval gps google streetview urban panorama pittsburgh address manhattan sphere link 2015-06-24 590
98 BSDS300 The goal of this work is to provide an empirical basis for research on image segmentation and boundary detection. To this end, we have collected 12,000 hand-la... segmentation, edge, contour, detection link 2013-03-12 709
79 LabelMe The goal of LabelMe is to provide an online annotation tool to build image databases for computer vision research. You can contribute to the database by visitin... segmentation, semantic, outdoor, detection, urban, software link 2013-03-14 630
142 German Traffic Sign Recognition Benchmark The German Traffic Sign Recognition Benchmark is a dataset for a multi-class detection problem in natural images; the organizers cordially invite you to participate. The b... detection, traffic, urban, recognition link 2016-08-15 1205
315 Geosemantic The Geosemantic is a dataset of object locations from GIS and a query image with metadata. It is used to project the buildings and streets that are in the field... semantic segmentation gps geography supervised gis link 2016-01-07 332
205 GaTech VideoStab The GaTech VideoStab dataset consists of N videos for the task of video stabilization. This code is implemented in Youtube video editor for stabilization. ... video stabilization camera path link 2013-10-09 679
203 GaTech VideoSeg The GaTech VideoSeg dataset consists of two (waterski and yunakim?) video sequences for object segmentation. There exists no groundtruth segmentation annotat... video segmentation object motion model camera link 2013-10-09 709
206 GaTech VideoContext The GaTech VideoContext dataset consists of over 100 groundtruth annotated outdoor videos with over 20000 frames for the task of geometric context evaluation i... video geometry context classification semantic segmentation unsupervised supervised outdoor urban nature link 2014-04-06 712
298 Freiburg-Berkeley Motion Segmentation The Freiburg-Berkeley Motion Segmentation Dataset (FBMS-59) is an extension of the BMS dataset with 33 additional video sequences. A total of 720 frames is anno... video segmentation benchmark object tracking pedestrian groundtruth motion link 2017-03-21 696
222 Ford Car Dataset The Ford Car dataset is a joint effort of Pandey et al. (for collecting images, Lidar points, calibration etc.) and us (for annotation of 2D and 3D objects). ... car detection lidar 3d groundtruth sfm link 2014-04-16 1457
74 PMVS 3D Photography The following are multiview stereo data sets captured in our lab: a set of images, camera parameters and extracted apparent contours of a single rigid object. E... sfm, reconstruction, depth, dense, mesh link 2017-01-31 947
147 FlickrLogos-32 The FlickrLogos-32 dataset contains photos showing brand logos and is meant for the evaluation of multi-class logo recognition as well as logo retrieval methods... flickr, logo, detection, retrieval, image, object recognition, machine learning, classification brand boundingbox link 2017-01-11 935
226 Fish4Knowledge The Fish4Knowledge project is pleased to announce the availability of 2 subsets of our tropical coral reef fish video and extracted... classification animal fish video motion nature recognition water camera link 2014-05-15 784
189 Farman Institute 3D Point Sets The Farman Institute 3D Point Sets dataset contains 11 objects by a 3D laser scanner. This dataset was peer-reviewed by Image Processing On Line: Farman Institu... 3d laser scanner object reconstruction model point link 2013-09-18 599
257 FaceScrub The FaceScrub dataset comprises a total of 107818 unconstrained face images of 530 celebrities crawled from the Internet, with about 200 images per person. M... face detection recognition celebrity people human link 2014-11-24 676
310 FASSEG - FAce Semantic Segmentation The FAce Semantic SEGmentation (FASSEG) repository contains datasets for multi-class semantic face segmentation. The FASSEG repository is composed of two dat... face, segmentation link 2017-04-04 608
301 CMP Extreme Zoom Dataset The Extreme Zoom Dataset (EZD) consists of 6 image sets with increasing zoom factor, from a general scene view to focusing on a single detail. MODS: Fast and Robust Metho... feature detection description matching viewpoint zoom link 2015-07-15 363
316 Extreme Classification Repository The Extreme Classification Repository: Multi-label Datasets & Code Kush Bhatia • Himanshu Jain • Prateek Jain • Manik Varma The objective in extreme multi... machine learning multilabel classification benchmark evaluation link 2016-01-23 403
260 Eurasian Cities dataset The Eurasian Cities dataset contains 103 images of outdoor urban scenes taken in Eurasian cities. It is annotated with horizontal and vertical vanishing points ... vanishing line point geometry pose urban reconstruction outdoor manhattan link 2016-11-29 704
90 eTrims The eTrims dataset is comprised of two datasets, the 4-Class eTRIMS Dataset with 4 annotated object classes and the 8-Class eTRIMS Dataset with 8 annotated obje... semantic, segmentation, urban, reconstruction link 2013-03-12 551
75 ETHZ Shape The ETHZ Shape classes dataset from Vittorio Ferrari [?] consists of five object classes and a total of 255 images. All classes contain significant intra-class ... shape, detection, matching, segmentation, clutter, applelogo, bottle, giraffe, nature, swan, mug link 2014-02-11 652
56 ETHZ Extended Shape The ETHZ Extended Shape classes dataset from Konrad Schindler is a larger dataset of shape categories, created by merging ETHZ shape classes with Konrad Schindler... detection, shape, segmentation, clutter link 2013-03-11 563
32 ECP Paris 2011 The ECP Paris 2011 dataset consists of 104 images taken from rue Monge in the fifth district of Paris; we kept only 20 for training and 10 for testing. Howev... segmentation, semantic, procedural, reconstruction, urban, paris link 2013-08-08 540
33 ECP New York 2011 The ECP New York dataset contains 10 manually segmented buildings from New York City, USA. Segmentation evaluating using Dice coefficient is calculated for the ... segmentation, semantic, procedural, reconstruction, urban, newyork link 2013-08-08 519
31 ECP Paris 2010 The Ecole Centrale Paris 2010 (Paris 2010) dataset consists of 30 images of densely annotated building facades in seven classes - wall, window, sky, shop, balco... segmentation, semantic, procedural, reconstruction, urban, paris link 2013-03-11 601
172 DynTex dataset The DynTex dataset consists of a comprehensive set of Dynamic Textures. Dynamic, or temporal, texture is a spatially repetitive, time-varying visual pattern tha... texture, segmentation, dynamic, synthetic, video repetition link 2013-08-12 633
131 Dubrovnik6K and Rome16K The Dubrovnik6K and Rome16K datasets are image collections for SfM reconstruction, where the suffix refers to the number of images in the dataset. Dubrovnik6... reconstruction, sfm, urban, landmark, dubrovnik, rome link 2017-03-10 809
53 DTU Robot The DTU Robot dataset consists of color images of 60 scenes acquired in a controlled setup from 119 different positions and under different lighting. For each s... feature, detection, description, matching, sfm, reconstruction, illumination link 2016-05-15 669
62 Deformed Lattice Detection The Deformed Lattice Detection In Real-World Images dataset is used for regular grid detection. The authors have developed a robust and fast lattice detection a... texture, segmentation, symmetry, lattice, detection, urban link 2013-03-11 613
136 3D Object in Clutter Recognition and Segmentation The dataset is composed of 150 synthetic scenes, captured with a (perspective) virtual camera, and each scene contains 3 to 5 objects. The model set is composed... recognition, segmentation, mesh, synthetic link 2013-08-08 748
88 Change Detection The dataset folder contains 7 folders (one for each category). Each category folder contains 4 to 6 folders (one for each video). Each video folder contains: ... change, detection, background link 2013-03-13 581
342 ICS-FORTH + Modelling of 2D Shapes with Ellipses The dataset contains more than 4,536 2D shapes included in standard as well as in home-built datasets. Our goal is to represent a given 2D shape with an au... shapes ellipses fitting modelling AIC link 2016-10-15 113
235 Kindergarten Video Surveillance The dataset consists of about 50 hours of kindergarten surveillance video. In total, the dataset has approximately 100 video sequences (1000GB, 50 hours)... human action behavior segmentation video background surveillance link 2015-10-08 960
201 50 Salads The dataset captures 25 people preparing 2 mixed salads each and contains over 4h of annotated accelerometer and RGB-D video data. Annotated activities correspo... action activity recognition classification detection tracking video link 2013-10-05 717
368 Nude Detection Dataset — Videos (NPDI/DCC/UFMG) The database of nude and non-nude videos contains a collection of 179 video segments collected from the following movies: Alpha Dog, Basic Instinct, Before The ... nude detection, videos, movies link 2017-03-31 40
318 Mouse Embryo Tracking Database The database contains, for each of the 100 examples: (1) the uncompressed frames, up to the 10th frame after the appearance of the 8th cell; (2) a text file wit... tracking cell circle biology mouse trajectory link 2016-02-11 257
325 Synthesized Inverse Synthetic Aperture Radar (ISAR) Images of Aircrafts The database contains synthesized inverse synthetic aperture radar images of seven aircraft models. Reference: Hari Kishan Kondaveeti, Valli Kumari Va... ISAR, image, classification link 2016-03-17 405
324 Historical Car Database The database contains historical car images from 1920s to 1990s crawled from There are 10130 training and 3343 test images. Annotations incl... Car, Recognition, Time link 2016-03-17 465
369 Nude Detection Dataset — Images (NPDI/DCC/UFMG) The database contains 180 images collected from the Web. If you make use of our database, please cite the following reference: LOPES, Ana; AVILA, Sandra... nude detection, images link 2017-03-31 52
49 PhotoTourism Pair The data is taken from Photo Tourism reconstructions from Trevi Fountain (Rome), Notre Dame (Paris) and Half Dome (Yosemite). Each dataset consists of a series ... feature, matching, description, pair, sfm link 2013-03-11 709
269 Daimler Urban Segmentation Dataset The Daimler Urban Segmentation Dataset consists of video sequences recorded in urban traffic. The dataset consists of 5000 rectified stereo image pairs with a r... semantic segmentation outdoor urban stereo motion link 2015-06-26 838
190 Daimler Mono Pedestrian Detection Benchmark The Daimler Mono Pedestrian Detection Benchmark dataset contains a large training and test set. The training set contains 15.560 pedestrian samples (image cut-o... pedestrian detection outdoor urban mono scale object link 2013-09-18 772
191 Daimler Mono Pedestrian Classification Benchmark The Daimler Mono Pedestrian Classification Benchmark dataset consists of two parts: a base data set. The base data set contains a total of 4000 pedestrian- a... pedestrian classification outdoor urban object scale illumination link 2013-09-18 648
216 CVC Partial Occlusion Virtual Pedestrian The CVC Partial Occlusion Virtual Pedestrian datasets (CVC-01 to CVC-06) cover a range of scenarios of occluded pedestrians generated in a virtual and real envi... detection classification tracking pedestrian synthetic urban occlusion link 2016-03-15 1036
41 KTH Action The current video database containing six types of human actions (walking, jogging, running, boxing, hand waving and hand clapping) performed several times by 2... action, classification, video, segmentation link 2013-03-12 563
263 Crowd Dataset The crowd datasets are collected from a variety of sources, such as UCF and data-driven crowd datasets. The sequences are diverse, representing dense crowd in t... crowd video detection anomaly scene understanding human pedestrian link 2016-11-05 1108
309 Contour patches The contour patches dataset is a large dataset of image patch matches used for contour detection. References: C. L. Zitnick and D. Parikh The Role of Im... patch image match contour edge lowlevel detection segmentation link 2015-09-29 356
278 Comprehensive Cars (CompCars) The Comprehensive Cars (CompCars) dataset contains data from two scenarios, including images from web-nature and surveillance-nature. The web-nature data contai... car vehicle recognition attribute classification fine-grained urban object link 2015-11-18 1025
319 Visual Search Patches The Compact Descriptors for Visual Search Patches Dataset (CDVS) is a dataset comprised of pairwise image patches. MPEG is a standard titled Compact Descriptor... patch matching retrieval descriptor feature mpeg link 2016-02-11 284
152 Colosseum and San Marco The Colosseum and San Marco are two image datasets for dense multiview stereo reconstructions used for evaluating the visual photo realism. The datasets are ... 3d, reconstruction, landmark, urban, sfm, aerial, streetside, flickr link 2015-05-04 1080
103 COIL-100 The COIL-100 (Columbia University Image Library) consists of 100 objects. For formal documentation look at the corresponding compressed technical report, [gzipp... classification, retrieval link 2013-03-12 576
124 CMU Geometric Context The CMU Geometric Context dataset by Derek Hoiem, Alexei A. Efros, Martial Hebert consists of 300 images used for training and testing the geometric context met... reconstruction, single view, depth, context, geometry link 2016-06-29 677
302 CMP map2photo The CMP map2photo dataset consists of 6 pairs, where one image is satellite photo and second image is a map of the same area. The task is to match these images... feature detection description matching map remote sensing wide baseline link 2015-08-13 420
179 CMP Facades The CMP Facade dataset consists of facade images assembled at the Center for Machine Perception, which includes 600 rectified images of facades from various sou... facade rectification urban semantic classification recognition structure similarity segmentation link 2015-06-19 572
193 City planar and non-planar The city planar and non-planar dataset consists of urban scenes accompanied by text files describing the plane/non-plane locations. Training Set (University)... plane detection 3d urban building estimation link 2013-09-23 542
101 CIFAR-10 / 100 The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images. ... classification, tiny, color, patch, scene, object link 2013-08-08 624
94 Chars74K The Chars74K dataset consists of 64 classes (0-9, A-Z, a-z), 7705 characters obtained from natural images, 3410 hand drawn characters using a tablet PC, 62992 s... text, detection, recognition, classification link 2016-04-22 1092
171 CHALEARN Multi-modal Gesture Challenge The CHALEARN Multi-modal Gesture Challenge is a dataset of 700+ sequences for gesture recognition using images, Kinect depth, segmentation and skeleton data. ht... gesture, kinect, recognition, human, action, illumination, depth, segmentation, skeleton link 2013-08-09 652
34 CamVid The Cambridge-driving Labeled Video Database (CamVid) dataset from Gabriel Brostow [?] contains ten minutes of video footage and corresponding semantically labe... sfm, depth, semantic, segmentation, urban link 2016-04-18 1878
78 Caltech Pedestrian The Caltech Pedestrian Dataset consists of approximately 10 hours of 640x480 30Hz video taken from a vehicle driving through regular traffic in an urban environ... pedestrian, detection, urban link 2016-06-07 1388
160 Caltech Lanes Dataset The Caltech Lanes dataset includes four clips taken around streets in Pasadena, CA at different times of day. The archive below includes 1225 individual frame... urban road lane detection caltech pasadena link 2013-08-08 793
159 Caltech Game Covers Dataset The Caltech Game Covers dataset consists of CD/DVD covers of video games. The set was downloaded from during the summer of 2008. The set includes... classification retrieval game cover caltech hierarchy taxonomy link 2014-02-20 567
158 Caltech Buildings Dataset The Caltech Buildings dataset consists of images taken for 50 buildings around the Caltech campus. Five different images were taken for each building from diffe... building urban retrieval hierarchy taxonomy caltech link 2013-08-08 600
20 CALTECH 256 The CALTECH 256 dataset by Li Fei-Fei contains 30607 images for 256 categories.... classification centered object scene image link 2013-08-08 631
19 CALTECH 101 The CALTECH 101 dataset by Li Fei-Fei contains images for 101 categories with about 40 to 800 images per category. Most categories have about 50 images at rough... classification centered object scene image link 2013-08-08 665
48 CALTECH 101 Category Patch Pairs The CALTECH 101 Category Patch Pairs dataset measures invariance to intra-category variation. The dataset contains a training set and testing set of image patc... feature, matching, description, pair, binary link 2017-02-14 2838
138 Buffy The Buffy dataset contains images selected from the TV series, Buffy: the Vampire Slayer. We select a set of 452 images from the first two episodes for training... segmentation, detection, buffy, movie, human link 2015-02-07 574
176 Brodatz Album The Brodatz dataset consists of 112 textures in grayscale images of various texture types. texture, segmentation, classification, benchmark, synthetic link 2014-12-23 888
106 BIWI Walking Pedestrians (EWAP) The BIWI Walking Pedestrians (EWAP) dataset shows walking pedestrians in busy scenarios from a bird eye view. Manually annotated. Data used for training in our ... detection, tracking, pedestrian, trajectory, crowd, overlap, occlusion, aerial link 2013-08-02 1290
297 Berkeley Video Segmentation The Berkeley Video Segmentation Dataset (BVSD) contains videos for segmentation (boundary?) Dataset train Dataset test... video segmentation benchmark link 2015-07-14 492
141 Berkeley Multimodal Human Action Database (MHAD) The Berkeley Multimodal Human Action Database (MHAD) contains 11 actions performed by 7 male and 5 female subjects in the range 23-30 years of age except for on... action classification multiview motion recognition link 2014-02-03 737
246 Bristol Egocentric Object Interactions Dataset The BEOID dataset includes object interactions ranging from preparing a coffee to operating a weight lifting machine and opening a door. The dataset is recorded... video interaction object ecocentric pose 3d tracking link 2014-09-30 829
50 Babenko tracking The Babenko tracking dataset contains 12 video sequences for single object tracking. For each clip they provide (1) a directory with the original image s... tracking single object animal face occlusion video link 2016-08-08 2099
313 Automotive Multi-sensor (AMUSE) The automotive multi-sensor (AMUSE) dataset consists of inertial and other complementary sensor data combined with monocular, omnidirectional, high frame rate v... streetside urban inertial video image traffic city api link 2015-11-15 533
187 Aspect Layout dataset The Aspect Layout dataset is designed to allow evaluation of object detection for aspect ratios in perspective images. Author text: In this project we see... detection object aspect ratio perspective layout link 2013-09-06 490
161 ICG Annotated Facial Landmarks in the Wild (AFLW) The Annotated Facial Landmarks in the Wild (AFLW) consists of a large-scale collection of annotated face images gathered from the web, exhibiting a large variet... face detection landmark pose age annotation link 2016-12-13 1451
181 All I Have Seen (AIHS) The All I Have Seen (AIHS) dataset is created to study the properties of total visual input in humans, for around two weeks Nebojsa Jojic wore a camera capturin... video summary user study clustering similarity outdoor indoor scene 3d link 2013-09-05 595
180 Airport MotionSeg The Airport MotionSeg dataset contains 12 sequences of videos of an airport scenario with small and large moving objects and various speeds. It is challenging b... motion segmentation airport video clustering camera zoom link 2013-09-04 737
84 Aachen Retrieval The Aachen dataset consists of 4479 images taken with multiple cameras (3GB), 369 query images taken with the camera of a mobile phone together with their SIFT ... retrieval, aachen, landmark, sfm, reconstruction link 2013-03-11 733
267 3DVis The 3DVis dataset includes a set of 12 heterogeneous scenes for testing 3D scene registration and analysis methods. Models include homogeneous shapes, repetitiv... 3d reconstruction matching registration shape symmetry link 2015-01-26 423
223 SHOT 3D shape description The 3D shape description dataset consists of multiple sub-datasets Descriptor Matching - Dataset 1 & 2 (Stanford) These datasets, created from some of the m... 3d shape description benchmark reconstruction registration matching link 2015-06-21 787
220 3D Mask Attack Dataset The 3D Mask Attack Database (3DMAD) is a biometric (face) spoofing database. It currently contains 76500 frames of 17 persons, recorded using Kinect for both re... 3d biometry face recognition segmentation frontview emotion link 2016-03-14 760
350 The 2D Shape Structure Dataset The 2D Shape Structure database is a public, user-generated dataset of 2D shape decompositions into a hierarchy of shape parts with geometric relationships reta... 2d shape decomposition, 2d shape hierarchy, 2d shape structure, Medial axis link 2016-11-17 191
303 1DSfM Landmarks The 1DSfM Landmarks is a collection of community-based image reconstruction by Kyle Wilson and is comprised of 14 datasets with comparison to bundler ground tru... 3d reconstruction landmark groundtruth benchmark urban city link 2015-08-05 485
249 Image Sequence Analysis Test Site (EISATS) The .enpeda.. Image Sequence Analysis Test Site (EISATS) offers sets of long bi- or trinocular image sequences recorded in the context of vision-based driver as... stereo vision optical flow motion analysis semantic segmentation link 2014-09-30 844
237 MOViCS video co-segmentation dataset The video co-segmentation dataset contains 4 video sets with a total of 11 videos, with 5 frames of each video labeled with pixel-level ground truth.... video co-segmentation dataset link 2016-02-18 856
150 SDHA Contest The Semantic Description of Human Activities (SDHA) was a contest at ICPR 2010. The contest is composed of three different types of activity recognition cha... detection, tracking, pedestrian, trajectory, crowd, overlap, occlusion, aerial link 2013-07-31 753
238 Multiple Foreground Video Co-segmentation The multiple foreground video co-segmentation dataset, consisting of four sets, each with a video pair and two foreground objects in common. The dataset includ... video co-segmentation link 2014-08-14 697
264 Domain-specific Personal Videos Highlight Dataset The domain-specific personal videos highlight dataset from the paper [1] describes a fully automatic method to train domain-specific highlight ranker for raw p... video summarization saliency wearable human action recognition domain link 2015-05-02 537
221 EPFL Multi-View Cars The EPFL Multi-View Car dataset contains 20 sequences of cars as they rotate by 360 degrees. There is one image approximately every 3-4 degrees. Using the time o... pose multiview car detection estimation rotation link 2014-02-10 852
308 TST Intake Monitoring dataBase It is composed of food intake movements, recorded with Kinect V1 (320×240 depth frame resolution), simulated by 35 volunteers for a total of 48 tests. The device... human food intake monitoring behavior kinect pointcloud tracking age groundtruth link 2016-02-11 328
253 Street View House Number (SVHN) SVHN is a real-world image dataset for developing machine learning and object recognition algorithms with minimal requirement on data preprocessing and formatti... streetview number recognition classification urban streetside detection text real world link 2016-08-24 811
95 Stroke Width Transform Text Stroke Width Transform Text dataset is by Boris Epstein and consists of 307 images and XXX text instances. Detecting Text in Natural Scenes with Stroke Wid... text, detection, recognition, classification link 2015-04-24 823
287 INRIA Lafarge Benchmarks Some datasets and evaluation tools are provided on this page for four different computer vision and computer graphics problems. Population counting Line-ne... 3d surface reconstruction groundtruth pointcloud object detection line road network urban crowd pedestrian counting link 2015-06-18 662
364 ETH CVL IMDB WIKI Faces Since the publicly available face image datasets are often of small to medium size, rarely exceeding tens of thousands of images, and often without age informat... face imdb wikipedia detection recognition age biometry link 2017-02-22 137
83 Ikonos Aerial Since its launch in September 1999, Space Imaging IKONOS earth imaging satellite has provided a reliable stream of image data that has become the standard for c... reconstruction, sfm, urban, aerial link 2013-03-11 743
87 Simpsons 40 years Simpsons Homer 40 years is a dataset showing Homer Simpson over the course of 40 years. It is used for video segmentation and shape matching between frames.... video, segmentation, shape, matching n/a 2016-04-19 628
116 Sheffield Building Sheffield Building Image Dataset consists of over 3,000 low-resolution images of forty different buildings – typically between 70 and 120 images per building. T... retrieval, classification, urban, sheffield link 2013-03-12 604
273 SBMI 2015 Scene Background Initialization (SBI) dataset The SBI dataset has been assembled in order to evaluate and compare the results of background initialization al... change detection background initialization foreground benchmark link 2015-05-02 415
360 ETHZ Multi-Person Tracking Robust Multi-Person Tracking from Mobile Platforms In all cases, data was recorded using a pair of AVT Marlins F033C mounted on a chariot respectively a car,... pedestrian, color, sequence, tracking link 0000-00-00 122
211 POSTECH Labeled Faces in the Wild POS Labeled Faces in the Wild is a collection of faces proposed for studying face identification in unconstrained environments; its purpose is serving as a... face identification wild recognition registration link 2015-09-10 895
51 PN Learning PN Learning - How does TLD work? Tracking estimates the object location as long as the object is visible. During tracking all observed patterns of the object... singletarget tracking learning object pedestrian bike face link 2015-06-19 561
291 MIT Places205 The Places205 database contains 2.5 million images from 205 scene categories for the academic public. The image dataset contains 2,448,873 images from 205 scene c... place recognition urban scene feature learning link 2016-02-24 601
262 PHOS (Evaluating illumination invariance) Phos is a color image database of 15 scenes captured under different illumination conditions. Every scene of the database contains 15 different images: 9 images... Illumination invariance, real lighting conditions, uneven illumination, shadows, feature detection link 2017-03-20 517
281 Tuberculosis image and patient data Permanently growing database on lung tuberculosis patients. The data include radiological images (CT+XRay) plus social, clinical, and lab data as well as full g... chest xray CT tuberculosis genome medical segmentation link 2016-08-06 560
113 Penn-Fudan Pedestrian Penn-Fudan Pedestrian Detection and Segmentation... pedestrian detection segmentation background motion link 2013-08-08 694
365 Pedestrian Color Naming Dataset Pedestrian Color Naming (PCN) dataset contains 14,213 images, each of which hand-labeled with color label for each pixel.... Pedestrian, segmentation, color naming link 2017-03-13 75
229 Paris Rue Madame Paris-rue-Madame dataset contains 3D Mobile Laser Scanning (MLS) data from rue Madame, a street in the 6th Parisian district (France). The test zone contains ap... semantic segmentation pointcloud 3d laser classification link 2014-06-10 584
61 Digital Papercutting Papercutting is a widespread and ancient artform which, as far as we could tell, had no previous computational treatment. We developed algorithms to analyze the... symmetry link 2013-03-11 417
115 Pankrac Marseille Our repetitive pattern dataset with 106 images of app. 30 buildings from Pankrac, Prague and Marseille appearing in more than one image, number of appearances r... classification, retrieval, symmetry, repetition, urban link 2013-03-13 537
96 USPS Handwritten Digits Name: Classes Train. Ex. Test. Ex. Features USPS 10 7291 2007 256 8-bit grayscale images of "0" through "9"; handwritten digits; ... text, recognition, classification, handwritten link 2013-03-12 766
2 MPEG-7 Core Experiment CE-Shape-1 MPEG-7 Core Experiment CE-Shape-1 [?] is a popular database for shape matching evaluation consisting of 70 shape categories, where each category is represented ... shape, binary, matching, retrieval, bullseye link 2017-03-02 1466
13 CBCL / MIT Pedestrian MIT Pedestrian dataset from Papageorgiou and Poggio [IJCV2000] contains 509 training and 200 test images of pedestrians in city scenes (plus left-right reflecti... pedestrian, frontview, detection, urban, people, boundingbox link 2015-06-19 769
146 Multiple Instance Learning dataset MIL data sets used in our 2002 NIPS paper for Elephant, Musk, TREC machine learning, classification link 2013-05-30 634
289 ETHZ CVL Clust MICCAI 2015 Challenge on Liver Ultrasound Tracking Munich, October 9, 2015 (Full Day) Outline Ultrasound (US) imaging is a widely used medical imaging techn... medical liver tracking ultrasound therapy human organ benchmark real link 2015-06-19 360
248 VIDEO datasets overview Many different labeled video datasets have been collected over the past few years, but it is hard to compare them at a glance. So we have created a handy spread... video benchmark recognition classification detection object action link 2014-09-30 938
346 LASIESTA (Labeled and Annotated Sequences for Integral Evaluation of SegmenTation Algorithms) LASIESTA is composed of many real indoor and outdoor sequences organized in different categories, each one covering a specific challenge in moving object dete... Database; Dataset; Ground-truth; Moving object detection; Foreground detection; Background subtraction; Challenges; Stationary foreground; Moving camera link 2016-10-31 169
145 KnapSack KNAPSACK_01 is a dataset directory which contains some examples of data for 01 Knapsack problems. In the 01 Knapsack problem, we are given a knapsack of fixe... machine learning, classification link 2013-05-31 650
219 JPL First-Person Interaction JPL First-Person Interaction dataset (JPL-Interaction dataset) is composed of human activity videos taken from a first-person viewpoint. The dataset particularl... video action recognition interactive motion human link 2014-02-03 542
275 TST fall detection It is composed of ADL (activities of daily living) and fall actions simulated by 11 volunteers. The people involved in the test are aged between 22 and 39, with diff... action recognition detection depth kinect wearable accelerometer human video link 2017-03-14 736
283 ISPRS WG III/4 ISPRS Test Project on Urban Classification, 3D Building Reconstruction and Semantic Labeling. In this part of our working group site you will get further inform... aerial multiview 3d photogrammetry germany canada semantic segmentation urban city recognition benchmark link 2015-06-16 448
126 ISPRS Urban Classification ISPRS Test Project on Urban Classification and 3D Building Reconstruction The ISPRS working group III/4 announces the release of the 2D semantic labeling ben... 3d, reconstruction, building, urban, city, semantic, classification, recognition link 2014-11-24 624
282 ISPRS-EuroSDR HighDensity ISPRS and EuroSDR - Benchmark on High Density Aerial Image Matching Background and Scope of the project Innovations in matching algorithms as well as the... aerial multiview 3d photogrammetry germany switzerland urban city benchmark reconstruction link 2015-06-16 358
285 ISPRS-EuroSDR Multi-Platform ISPRS / EuroSDR Benchmark for Multi-Platform Photogrammetry In these pages you can get information about the BENCHMARK FOR MULTI-PLATFORM PHOTOGRAMMETRY unde... aerial multiview 3d photogrammetry germany switzerland urban city benchmark reconstruction link 2015-06-16 410
326 Desk3D (Cambridge University) Instance recognition from depth data. Contains various challenges of Pose, Clutter, Occlusion and similar looking objects (Bonde, U., Badrinarayanan, V., & Cipo... depth instance pose detection link 2016-04-15 353
134 Image Memorability Image memorability dataset contains target and filler images, precomputed features and annotations, and memorability. It gives features and annotations for t... aesthetics, semantic, quality, memorability link 2013-04-17 638
27 Idiap/ETHZ Faces and Poses Idiap/ETHZ Faces and Poses Dataset dataset by L. Jie, B. Caputo and V. Ferrari contains 1703 image-caption pairs. [author] Captions contain the names of some of... face, pose, pedestrian, text link 2013-03-11 649
236 iCoseg dataset iCoseg dataset introduces the largest publicly available co-segmentation dataset of 38 groups (643 images), along with pixel ground-truth hand annotations.... image co-segmentation link 2014-08-14 762
110 EITZ Sketch Quality Humans have used sketching to depict our visual world since prehistoric times. Even today, sketching is possibly the only rendering technique readily available ... shape, matching, retrieval, partial, sketch link 2014-02-11 571
143 KITTI Odometry Related Datasets TUM RGB-D Dataset: Indoor dataset captured with Microsoft Kinect and high-accuracy... registration, localization, odometry, slam, matching, navigation, urban path 3d reconstruction link 2013-09-30 886
294 Happy People Images Database Group emotion recognition in images - Happiness Intensity labels for group of people in images. The images have been collected from Flickr using keyword search ... group, facial expression, emotion, wild, human, flickr, behavior link 2015-07-13 539
348 Global Symmetry AVA Dataset Global Symmetry Ground-truth for AVA dataset Release Date: 2016 For detailed information, please refer to: Elawady, Mohamed, Cécile Barat, Christophe Duc... Global Bilateral Symmetry Detection Aesthetic Reflection Mirror link 2016-11-02 143
139 image panorama gdbicp Generalized Dual Bootstrap-ICP Algorithm ... registration, panorama, matching link 2013-05-21 547
335 General 100 General-100 dataset contains 100 bmp-format images (with no compression). We used this dataset in our FSRCNN ECCV 2016 paper. The size of these 100 images range... image superresolution link 2016-09-14 188
270 B3DO: Berkeley 3D Object Dataset For the first few decades of the field's existence, computer vision has been focused on algorithmic, logical approaches to perception. But it was only with the a... 3d kinect reconstruction indoor depth object recognition link 2015-03-16 490
230 FGVC-Aircraft Fine-Grained Visual Classification of Aircraft (FGVC-Aircraft) is a benchmark dataset for the fine grained visual categorization of aircraft. Data, annotatio... fine-grained classification recognition benchmark evaluation aircraft airplane link 2017-02-16 1208
354 Facial Expression Research Group Database (FERG-DB), University of Washington, Seattle FERG-DB is a database of stylized characters with annotated facial expressions. The database contains multiple face images of six stylized characters. The chara... Face, Facial expression, Animation, Stylization, annotation emotion, deep learning, anger, sad, joy, disgust, surprise, neutral, fear, cardinal classification, human transfer, image retrieval link 2017-02-27 183
69 HCI Robust Vision Estimate robust and reliable depth or motion fields on our challenging real world videos! ... flow, depth, stereo, outdoor link 2016-09-08 696
277 Detail 2D Projection DataSet Detail 2D Projection DataSet is a database of 2d projections of mechanical details with holes. The dataset consists of 13 shape categories where each category i... shape, holes, detail, binary, matching, retrieval link 2015-05-10 413
339 Annotated Web Ears Dataset (AWE Dataset) Dataset contains 1000 images of 100 persons, with 10 images per person and is freely available. All images were acquired by cropping ears from images from the i... ear biometry person pedestrian recognition human lighting link 2017-02-16 145
207 CASIA Gait Recognition Dataset Dataset A (former NLPR Gait Database) was created on Dec. 10, 2001, including 20 persons. Each person has 12 image sequences, 4 sequences for each of the three ... gait recognition biometry action classification motion human foot pressure link 2017-03-10 1946
340 Ljubljana CVL Face Database Database contains 798 images of 114 persons, with 7 images per person and is freely available for research purposes. All images were taken in supervised conditi... face pedestrian person recognition biometry human illumination lighting link 2017-02-22 177
77 Daimler Stereo Pedestrian Daimler Stereo Pedestrian Detection Benchmark C. Keller, M. Enzweiler, and D. M. Gavrila, A New Benchmark for Stereo-based Pedestrian Detection, Proc. of th... pedestrian, detection, urban link 2013-03-13 628
76 Daimler Pedestrian Classification Daimler Multi-Cue, Occluded Pedestrian Classification Benchmark Training and test samples have a resolution of 48 x 96 pixels with a 12-pixel border around t... detection, classification, pedestrian, urban link 2013-03-11 660
341 CVL OCR DB CVL OCR DB is a public annotated image dataset of 120 binary annotated (text/non-text) images of text in natural scenes. Images include signboards, shop names, ... OCR, sign recognition link 2016-10-13 119
239 CUHK crowd dataset CUHK crowd dataset introduces the largest publicly available crowd dataset of 474 videos from 215 crowded scenes. It has been used in the paper: Scene-Ind... crowd analysis, group detection and analysis, scene understanding link 2016-09-14 1026
353 COCO-Stuff COCO-Stuff augments the COCO dataset with pixel-level stuff annotations for 10,000 images. These annotations can be used for scene understanding tasks like sema... semantic segmentation stuff things COCO captioning annotation groundtruth benchmark link 2017-02-16 209
123 CMU/VMR Urban Image+Laser CMU/VMR Urban Image+Laser dataset contains 372 images linked with 3D laser points projections. There are additional images (due to the laser scanner being turne... reconstruction, sfm, urban, semantic, segmentation, laser link 2013-04-02 823
47 CMP Retrieval CMP Dataset by Ondra Chum contains 5 million images collected from the internet.... retrieval, urban, large scale link 2013-03-11 562
213 ChairGest Gestures ChairGest is an open challenge / benchmark. The task consists in spotting and recognizing gestures from multiple synchronized sensors: 1 Kinect and 4 Xsens Ine... benchmark recognition kinect gesture detection human link 2014-06-06 528
155 KUL Belgium Traffic Sign Classification BelgiumTSC dataset is built for traffic sign classification purposes. It is a subset of the BelgiumTS dataset and contains cropped images around annotations for 62 ... traffic sign classification urban road belgium link 2017-03-27 839
156 KUL Belgium Traffic Signs BelgiumTS is a large dataset with 10000+ traffic sign annotations, thousands of physically distinct traffic signs. 4 video sequences recorded with 8 high resolu... traffic sign classification urban road belgium camera calibration link 2017-03-27 1047
157 Background Models Challenge (BMC) Background Models Challenge (BMC) is a complete dataset and competition for the comparison of background subtraction algorithms. The main topics concern: -... background modeling change motion detection surveillance video segmentation link 2016-02-24 1144
357 udacity self-driving-car At Udacity, we believe in democratizing education. How can we provide opportunity to everyone on the planet? We also believe in teaching really amazing and usef... car robot driving autonomous street urban video recognition detection classification segmentation time synthetic link 2017-03-15 255
366 Multi-Camera Action Dataset An indoor action recognition dataset which consists of 18 classes performed by 20 individuals. Each action is individually performed for 8 times (4 daytime and ... Multi-Camera; Action Recognition; Cross-View Recognition; Open-View Recognition; link 2017-03-22 48
73 Strecha Dense MVS An evaluation benchmark for dense MVS for these datasets fountain-P11, Herz-Jesu-P8, entry-P10, castle-P19, Herz-Jesu-P25, castle-P30 . Images (corrected for... sfm, reconstruction, benchmark, depth, dense, mesh link 2014-11-11 1182
225 California-ND An Annotated Dataset For Near-Duplicate Detection In Personal Photo Collections Managing photo collections involves a variety of image quality assessment tas... retrieval duplicate copyright groundtruth detection link 2014-03-19 580
72 Acute3D Aiguille du Midi MVS Aiguille du Midi. France showing photographs with Camera: Mamiya ZD. 55mm. - Resolution: 5Mpixels, 53 images - Photographer: B. Vallet (Imagine/EVD - 2006) ... sfm, reconstruction, mesh, large scale, outdoor link 2013-03-21 712
132 Aesthetic Visual Analysis Aesthetic Visual Analysis (AVA) dataset studies the organization of content by aesthetic preference. It contains over 250,000 images along with a rich variety o... aesthetics, semantic, quality, memorability link 2017-01-10 924
119 AdelaideRMF AdelaideRMF: Robust Model Fitting Data Set AdelaideRMF is a data set for robust geometric model fitting (homography estimation and fundamental matrix estimat... feature, matching, getry, model link 2017-04-17 924
352 4D Light Field Dataset (HCI Heidelberg & CVIA Konstanz) A synthetic light field dataset with 24 scenes. Data provided for each scene: - 9x9x512x512x3 light fields as individual PNGs - config files with camera s... light field, ground truth, synthetic, disparity, depth link 2016-12-05 158
243 PEdesTrian Attribute (PETA) dataset A new large-scale PEdesTrian Attribute (PETA) dataset. The dataset is by far the largest of its kind, covering more than 60 attributes on 19000 images. In comp... Pedestrian, crowd link 2014-09-16 1552
329 Virginia Tech and Arab Academy for Science & Technology (VT-AAST) The VT-AAST Benchmarking Dataset A New Color Image Database for Benchmarking of Face Detection Techniques and Human Skin Segmentation Techniques. A new color face image database for ... face, detection, skin, segmentation, benchmarking link 2016-07-11 323
358 BYU+VT Small Aircraft Flight Encounters initial dataset A dataset of 11 encounters between two small Unmanned Aircraft Systems. The "host" UAS carries two stereo HD video cameras, a custom FM-CW radar, a PixHawk navi... uas unmanned aircraft sense avoid stereo flying flight byu vt radar link 2017-01-14 148
343 FIRE Fundus Image Registration Dataset A benchmark dataset for the evaluation of retinal image registration methods is introduced. The dataset consists of 134 image pairs and is annotated with ground... retina retinal image registration fundus eye link 2016-10-17 162
351 CMLA Subpixel Stereo Dataset A dataset of 66 stereo pairs with subpixel ground truth. The construction and improvement of algorithms for subpixel stereovision require very precise t... stereo stereovision subpixel groundtruth 3D pointcloud noise depth link 2017-03-03 156
362 LITIV Datasets 3 datasets: PTZ Tracking, Thermal-visible registration, Single object tracking... Tracking, PTZ, Thermal, Pedestrian link 2015-09-30 153
224 CMP Extreme View Dataset 15 wide baseline stereo image pairs with large viewpoint change, provided ground truth homographies. Image size (~1000x700 pixels, RGB) D. Mishkin and M. ... feature detection description matching viewpoint link 2015-07-15 741
334 LabelMeFacade The LabelMeFacade dataset contains buildings, windows, sky and a limited number of unlabeled regions (covering at most 20% of the image). This procedure res... segmentation semantic facade urban rectified recognition link 2016-08-23 302
228 MPI VehicleScenes Abstract Scene understanding has (again) become a focus of computer vision research, leveraging advances in detection, context modeling, and tracking. In thi... semantic segmentation scene understanding classification 3d car pedestrian link 2014-06-10 912
42 Hollywood Videos Hollywood-2 dataset contains 12 classes of human actions and 10 classes of scenes distributed over 3669 video clips and approximately 20.1 hours of video in t... action, classification, video, segmentation link 2013-03-12 858
64 Middlebury Flow ... flow, depth link 2013-03-11 515

total views: 259457