Yet Another Computer Vision Index To Datasets (YACVID)

This website provides a list of frequently used computer vision datasets. Wait, there is more!
Each dataset also has a description covering common problems, pitfalls and characteristics, and there is now a searchable tag cloud.
Plus, this is open for crowd editing (if you pass the ultimate Turing test)! - Questions? yacvid [at] hayko [dot] at

Content, Design and Idea © by Hayko Riemenschneider, 2011-2018. Texts and Images are subject to the copyright of their respective authors.

Hey! If you're reading this, why not help and update the description of the dataset you're working on? Add a new dataset! Yay!



2d   3d   4d   aachen   abdomen   abrupt   accelerometer   accuracy   action   activity   actor   address   adhead   adjustment   adult   aerial   aesthetics   affordance   age   aircraft   airplane   airport   alignment   amazon   ambiguous   analysis   anger   animal   animation   annotation   anomaly   apartment   api   appearance   applelogo   architecture   articulation   artificial   aspect   asset   atmospheric   attention   attribute   australia   authentication   automatic   autonomous   avoid   axis   babyface   background   balance   bark   baseline   behavior   belgium   benchmark   benchmarking   berlin   bike   bilateral   bim   binary   biology   biometric   biometry   blender   blur   boat   body   bone   bottle   boundingbox   brain   brand   bremen   buffy   building   bullseye   bundle   bunny   byu   cad   calibration   california   caltech   camera   canada   caption   captioning   capture   car   cardinal   categorization   category   cats   cbir   celebrity   cell   centered   ceramics   chair   challenge   change   chemistry   chest   chicaco   chromaticity   church   circle   city   cityscapes   classification   clinical   clothing   cloud   clustering   clutter   cnn   co-localization   co-saliency   co-segmentation   co-skeletonization   coco   code   codebook   coffee   collaboration   collaborative   color   community   comparison   computer   condition   constancy   context   contour   cooking   copyright   counting   cover   cow   crepe   crf   crop   cross-view   crowd   ct   cutting   daily   dance   dark   data   dataset   day   daylight   decomposition   deep   defocus   deformation   denoising   dense   depth   description   descriptor   detail   detection   dichromatic   disease   disgust   disparity   dna   dogs   domain   dped   driving   drone   dublin   dubrovnik   duplicate   dynamic   ear   edge   egocentric   ellipse   emotion   empty   endtoend   enhancement   environment   estimation   evaluation   event   exhibit   
expertise   expression   eye   facade   face   facial   fake   family   fashion   fear   feature   field   fine-grained   fingerprint   fingertip   first-person   fish   fisheye   fitting   flickr   flight   floorplan   flow   fly   flying   fog   food   foot   footprint   foreground   forensics   fov   frames   frontview   fundus   gait   game   gan   gaze   gender   genetic   genome   geography   geometry   geoscience   geotag   geotagged   germany   gesture   getry   gif   giraffe   gis   glassware   global   google   gps   grammar   graphics   grayscale   graz   ground   groundtruth   group   growth   gsd   hand   handwritten   hd   head   heart   heat   hierarchy   high-definition   high-resolution   highlight   highway   holes   horse   hospital   house   howto   human   identification   illumination   illusion   image   imagenet   images   imdb   imu   indigenous   indoor   inertial   initialization   inserts   instance   intake   intensity   interaction   interactive   interest   internet   invariance   ir   isar   iso   joy   kaggle   kernels   keyframe   kimia   kinect   kinship   kitchen   kitti   label   labeling   laboratory   land   landmark   lane   language   large   large-scale   laser   lattice   layout   leaf   learning   letter   leuven   lidar   lifespan   light   lightfield   lighting   limited   line   lip   lisbon   liver   local   localization   location   logo   low-light   lowlevel   machine   makeup   manhattan   map   maritime   mask   match   matching   material   medial   medical   medicine   memorability   mesh   metadata   milling   mirror   mobile   mocap   model   modeling   monitoring   mono   montage   motion   motorbike   mouse   mouth   movement   movie   mpeg   mser   mug   multi-agent   multi-camera   multi-class   multi-human   multi-mode   multi-sensor   multi-spectral   multi-view   multilabel   multimedia   multimodal   multiple   multispectral   multitarget   multiview   museum   naming   natural   nature   navigation   
netherlands   network   neutral   newyork   night   nir   noise   normal   nude   number   object   occlusion   ocr   odometry   omnidirection   omnidirectional   online   open-view   operation   optical   optimization   organ   original   osnabrueck   outdoor   overhead   overlap   oxford   paintings   pair   pairwise   pan   panchromatic   panorama   panoramio   parallel   paris   parsing   part   partial   pasadena   pascal   patch   path   pattern   pedestrian   people   person   perspective   phase   photo   photo-realistic   photogrammetry   physics   pittsburgh   place   plane   planning   plant   point   pointcloud   polygon   popularity   pornography   pose   potsdam   presentation   pressure   primitive   privacy   procedural   product   profile   project   proposal   pruning   ptz   quality   question   radar   random   rank   ranking   ransac   rate   ratio   re-identification   reading   real   real-world   realism   recipe   recognition   reconstruction   rectification   rectified   reflection   registration   regression   regular   relationship   remote   removal   render   rendering   repetition   resolution   restoration   retina   retinal   retrieval   rgb   rgbd   road   robot   robotic   robust   rome   room   ros   rotation   sad   saliency   sampling   sanfrancisco   satellite   scale   scan   scanner   scene   sculptures   search   segmentation   selfdriving   semantic   sense   sensing   sequence   series   sfm   shadow   shape   sheffield   shoe   shots   shutter   sideview   sign   similarity   simulation   simultaneous   single   singleview   size   skeleton   skeletonization   sketch   skin   sky   slam   smartphone   soccer   social   software   source   space   spain   spanish   speaker   speech   speed   sphere   sport   stability   stabilization   static   stationary   steganalysis   steganography   stereo   stereovision   stochastic   street   structure   structured   study   stuff   style   stylization   subpixel   subtraction   
summarization   summary   superpixel   superresolution   supervised   supervisely   surface   surgery   surprise   surveillance   surveying.   swan   switzerland   sydney   symmetry   synthetic   table   target   task   taxonomy   temporal   text   textile   texture   texture-less   therapy   thermal   things   time   time-lapse   timelapse   timepieces   tiny   tokyo   tool   top-view   topcoder   tracking   tracklet   traffic   trajectory   transfer   transportation   tree   triangulation   truth   tuberculosis   turbulence   type   uas   uav   udacity   ultrasound   understanding   uneven   unmanned   unsupervised   urban   user   vanishing   variation   vegetation   vehicle   velodyne   vessel   video   view   viewpoint   virtual   visible   vision   visual   voc   volleyball   vqa   vt   water   wavelength   weakly   wear   wearable   weather   webcam   white   wide   wiki   wikipedia   wild   workflow   world   worldwide   xray   year   youtube   zoom   zurich  
«showing 682 tags of 682 total tags for 477 datasets (1.43) »


recognition
DID Name Description Tags URL Date Views
478 UE4Sim and Sim4CV Sim4CV is the general environment for simulating data for computer vision tasks, like object tracking, pose estimation, detection, action recognition, indoor sc... object tracking, pose estimation, detection, action recognition, indoor scene understanding, multi-agent collaboration, autonomous navigation, 3d reconstruction, crowd understanding, urban scene understanding, human tracking, aerial surveying. simulation environment 3d photo-realistic realism depth segmentation urban rgb render link 2018-11-30 34
475 MAE Dataset The Multimodal Attribute Extraction (MAE) dataset is the first benchmark dataset for the task of multimodal attribute extraction. It is composed of mixed media ... multimedia multimodal images text attribute recognition pair product search asset retrieval link 2018-11-20 12
474 Open MIC Open MIC (Open Museum Identification Challenge) contains photos of exhibits captured in 10 distinct exhibition spaces of several museums which showcase painting... museum recognition identification benchmark exhibit image paintings, timepieces, sculptures, glassware, ceramics, indigenous link 2018-10-24 35
469 ALASKA ALASKA is the second contest on steganalysis; after a fruitful first contest, called BOSS and organized in 2010, which gave birth to the development of large f... steganalysis steganography image recognition challenge forensics link 2018-09-07 45
465 IMDb-Face IMDb-Face is a new large-scale noise-controlled dataset for face recognition research. The dataset contains about 1.7 million faces, 59k identities, which is ma... Face recognition link 2018-09-06 65
463 Families In The Wild (FIW) Database Families In The Wild (FIW) Database is the largest and most comprehensive database available for kinship recognition. FIW is made up of 11,932 natural family p... recognition kinship family relationship dna similarity face link 2018-08-08 69
452 INRIA Praxis Gesture PRAXIS GESTURE DATASET is a new challenging RGB-D upper-body gesture dataset recorded by Kinect v2. The dataset is unique in the sense that it addresses the Pra... gesture rgbd body activity action kinect recognition taxonomy link 2018-04-16 121
434 Online RGBD Action Dataset (ORGBD) The Online RGBD Action dataset targets human action (human-object interaction) recognition based on RGBD video data. There are seven categories of human act... action rgbd online human recognition daily link 2018-03-15 169
433 20bn-Something-Something The 20BN-SOMETHING-SOMETHING dataset is a large collection of densely-labeled video clips that show humans performing pre-defined basic actions with everyday ob... action recognition human video daily link 2018-03-15 178
423 Makeup Induced Face Spoofing (MIFS) The Makeup Induced Face Spoofing (MIFS) dataset consists of 107 makeup transformations taken from random YouTube makeup video tutorials. Each subject is attempt... face recognition makeup illusion fake accuracy link 2018-02-28 311
420 ATLAS Dione Robot-Assisted Surgery Video Understanding Dataset by Roswell Park Cancer Institute / Th ATLAS Dione dataset provides video data (86 full subject study videos (~910 action clips)) of ten surgeons from Roswell Park Cancer Institute (RPCI) (Buffalo, N... robotic surgery tool detection object detection action recognition expertise VOC video gesture link 2018-03-22 250
417 Visual Lip Reading Feasibility (VLRF) The VLRF database is designed with the aim to contribute to research in visual-only speech recognition. A key difference of the VLRF database with respect to ex... lip reading recognition speaker spanish language mouth face speech link 2017-11-07 234
415 Total Text Dataset In order to facilitate a new text detection research, we introduce the Total-Text dataset, which is more comprehensive than the existing text datasets. The Tota... text detection, text recognition, scene text detection link 2017-11-02 467
410 Charades Activity Dataset 10,000 30sec videos from 267 volunteers, each annotated with multiple activities, captions, objects, and temporal localizations. From "Hollywood in Homes: Cr... video activity recognition action object caption localization detection human daily link 2018-03-22 358
406 Swedish Traffic Sign Recognition The Swedish Traffic Sign Recognition provides Matlab code for parsing the annotation files and displaying the results. Part0 for each set contains the annotated... traffic sign recognition detection urban city link 2017-09-12 563
396 ADE20k Scene Parsing Benchmark Scene parsing data and part segmentation data derived from the ADE20K dataset can be downloaded from the MIT Scene Parsing Benchmark. Images ... segmentation semantic annotation benchmark scene recognition link 2017-08-03 381
395 AWS Public Datasets AWS hosts a variety of public datasets that anyone can access for free. Previously, large datasets such as satellite imagery or genomic data have required hour... amazon aerial classification deep learning segmentation recognition satellite human biology space image resolution link 2018-10-26 651
392 M2CAI 2016 Challenge These datasets were generated for the M2CAI challenges, a satellite event of MICCAI 2016 in Athens. Two datasets are available for two different challenges: m2c... medicine video recognition surgery workflow challenge link 2017-07-11 363
391 xawAR16 The xawAR16 dataset is a multi-RGBD camera dataset, generated inside an operating room (IHU Strasbourg), which was designed to evaluate tracking/relocalization ... medicine video recognition surgery table operation depth link 2017-07-11 273
390 Cholec80 The Cholec80 dataset contains 80 videos of cholecystectomy surgeries performed by 13 surgeons. The videos are captured at 25 fps. The dataset is labeled with th... medicine video recognition surgery phase tool link 2018-12-07 348
389 action recognition benchmark We wanted to have a collection of action recognition papers and results that everybody can use for reference. The site will work by the community principle, so ... action recognition benchmark dataset link 2017-07-11 338
379 Crepe Cooking Dataset The Crepe Dataset provides 6 different types of structured activity videos in 1920x1080 resolution. Each activity is represented as a sequence of different acti... structured activity action recognition cooking recipe crepe simultaneous parallel link 2017-05-19 348
378 TVPR (Top View Person Re-identification) The TVPR dataset includes 23 registration sessions. Each of the 23 folders contains the video of one registration session. Acquisitions have been performed duri... person re-identification identification recognition people gender clothing video depth top-view indoor link 2018-01-25 515
376 ScanNet ScanNet is an RGB-D video dataset containing 2.5 million views in more than 1500 scans, annotated with 3D camera poses, surface reconstructions, and instance-le... scene indoor synthetic cad room layout rendering realism 3d segmentation object recognition link 2017-05-12 469
375 SUNCG: Indoor Scenes The SUNCG dataset is a Large 3D Model Repository for Indoor Scenes. SUNCG is an ongoing effort to establish a richly-annotated, large-scale dataset of 3D s... scene indoor synthetic room layout rendering realism 3d segmentation object recognition link 2018-10-17 610
366 Multi-Camera Action Dataset An indoor action recognition dataset which consists of 18 classes performed by 20 individuals. Each action is individually performed for 8 times (4 daytime and ... indoor video Multi-Camera Action Recognition Cross-View Recognition Open-View Recognition link 2017-09-12 490
364 ETH CVL IMDB WIKI Faces Since the publicly available face image datasets are often of small to medium size, rarely exceeding tens of thousands of images, and often without age informat... face imdb wikipedia detection recognition age biometry link 2017-02-22 555
357 udacity self-driving-car At Udacity, we believe in democratizing education. How can we provide opportunity to everyone on the planet? We also believe in teaching really amazing and usef... car robot driving autonomous street urban video recognition detection classification segmentation time synthetic link 2017-03-15 940
356 The Oxford RobotCar Dataset The Oxford RobotCar Dataset contains over 100 repetitions of a consistent route through Oxford, UK, captured over a period of over a year. The dataset captures ... car robot driving autonomous street urban video recognition detection classification segmentation time year link 2017-01-04 931
341 CVL OCR DB CVL OCR DB is a public annotated image dataset of 120 binary annotated (text/non-text) images of text in natural scenes. Images include signboards, shop names, ... OCR, sign recognition link 2016-10-13 476
340 Ljubljana CVL Face Database Database contains 798 images of 114 persons, with 7 images per person and is freely available for research purposes. All images were taken in supervised conditi... face pedestrian person recognition biometry human illumination lighting link 2017-02-22 817
339 Annotated Web Ears Dataset (AWE Dataset) Dataset contains 1000 images of 100 persons, with 10 images per person and is freely available. All images were acquired by cropping ears from images from the i... ear biometry person pedestrian recognition human lighting link 2017-02-16 685
337 WIDER Attribute Dataset WIDER ATTRIBUTE dataset is a human attribute recognition benchmark dataset, of which images are selected from the publicly available WIDER dataset. There are a ... Attribute recognition, Human attribute link 2016-09-22 1143
334 LabelMeFacade The LabelMeFacade dataset contains buildings, windows, sky and a limited number of unlabeled regions (maximally 20% covering of the image). This procedure res... segmentation semantic facade urban rectified recognition link 2016-08-23 947
324 Historical Car Database The database contains historical car images from 1920s to 1990s crawled from cardatabase.net. There are 10130 training and 3343 test images. Annotations incl... Car, Recognition, Time link 2016-03-17 956
291 MIT Places205 The Places205 database contains 2.5 million images from 205 scene categories for the academic public. The image dataset contains 2,448,873 images from 205 scene c... place recognition urban scene feature learning link 2016-02-24 1140
288 Berkeley Urban Street tracking The UrbanStreet dataset used in the paper can be downloaded here [188M] . It contains 18 stereo sequences of pedestrians taken from a stereo rig mounted on a ca... tracking detection segmentation multitarget recognition video pedestrian urban human link 2015-07-14 1574
283 ISPRS WG III/4 ISPRS Test Project on Urban Classification, 3D Building Reconstruction and Semantic Labeling. In this part of our working group site you will get further inform... aerial multiview 3d photogrammetry germany canada semantic segmentation urban city recognition benchmark link 2015-06-16 972
280 Yahoo Flickr Creative Commons 100M Yahoo Flickr Creative Commons 100M (YFCC100M) dataset contains a list of photos and videos. This list is compiled from data available on Yahoo! Flickr. All the ... flickr landmark image recognition detection reconstruction 3d clustering social community internet link 2015-09-24 1273
279 WWW Crowd The Where Who Why (WWW) dataset provides 10,000 videos with over 8 million frames from 8,257 diverse scenes, therefore offering a superior comprehensive dataset... surveillance crowd pedestrian detection recognition flow optical video link 2018-10-07 1466
278 Comprehensive Cars (CompCars) The Comprehensive Cars (CompCars) dataset contains data from two scenarios, including images from web-nature and surveillance-nature. The web-nature data contai... car vehicle recognition attribute classification fine-grained urban object link 2018-09-04 2157
276 TST TUG (Timed Up and Go) The TUG (Timed Up and Go test) dataset consists of actions performed three times by 20 volunteers. The people involved in the test are aged between 22 and 39, w... action recognition time kinect wearable accelerometer human video link 2015-05-02 807
275 TST fall detection It is composed of ADL (activity daily living) and fall actions simulated by 11 volunteers. The people involved in the test are aged between 22 and 39, with diff... action recognition detection depth kinect wearable accelerometer human video link 2017-03-14 1191
274 UBO 2014 Materials The UBO 2014 consists of 7 semantic categories. Each of these 7 material categories contains measurements of 12 different material instances for being capable t... material light illumination texture classification recognition link 2018-03-10 814
272 Stanford 40 Actions The Stanford 40 Actions dataset contains images of humans performing 40 actions. In each image, we provide a bounding box of the person who is performing the ac... human action recognition detection boundingbox link 2015-06-19 1320
271 Labeling in 3D Scenes This dataset package contains the software and data used for Detection-based Object Labeling on the RGB-D Scenes Dataset as implemented in the paper: Detecti... 3d kinect reconstruction indoor depth object recognition link 2015-03-16 1045
270 B3DO: Berkeley 3D Object Dataset For the first few decades of the field's existence, computer vision has been focused on algorithmic, logical approaches to perception. But it was only with the a... 3d kinect reconstruction indoor depth object recognition link 2015-03-16 928
266 Paris Art Deco Facades The Paris Art Deco Facades dataset consists of 79 / 80 images of rectified facades of the architectural style Art Deco, which has different sizes of windows, de... paris semantic segmentation recognition architecture facade urban city procedural grammar link 2015-01-20 922
264 Domain-specific Personal Videos Highlight Dataset The domain-specific personal videos highlight dataset from the paper [1] describes a fully automatic method to train domain-specific highlight ranker for raw p... video summarization saliency wearable human action recognition domain link 2015-05-02 1067
258 Visual Attributes dataset The Visual Attributes dataset contains visual attribute annotations for over 500 object classes (animate and inanimate) which are all represented in ImageNet. E... classification recognition attribute imagenet object link 2018-06-19 1179
257 FaceScrub The FaceScrub dataset comprises a total of 107818 unconstrained face images of 530 celebrities crawled from the Internet, with about 200 images per person. M... face detection recognition celebrity people human link 2018-06-30 1237
254 ChokePoint Dataset We collected a video dataset, termed ChokePoint, designed for experiments in person identification/verification under real-world surveillance conditions using e... human pedestrian identification recognition multiview sequence face detection real world surveillance clustering link 2015-05-02 1595
253 Street View House Number (SVHN) SVHN is a real-world image dataset for developing machine learning and object recognition algorithms with minimal requirement on data preprocessing and formatti... street number recognition classification urban detection text real world link 2017-11-28 1213
252 Volleyball Activity Dataset 2014 This dataset contains 7 challenging volleyball activity classes annotated in 6 videos from professionals in the Austrian Volley League (season 2011/12). A total... action activity sport volleyball detection recognition video analysis link 2017-07-05 1974
251 ETHZ CVL RueMonge 2014 The ETHZ CVL RueMonge 2014 dataset is used for 3D reconstruction and semantic mesh labelling for urban scene understanding. It was first published in [1] and p... semantic segmentation 3d reconstruction architecture paris benchmark source code urban recognition classification outdoor pointcloud mesh link 2014-11-24 1771
248 VIDEO datasets overview Many different labeled video datasets have been collected over the past few years, but it is hard to compare them at a glance. So we have created a handy spread... video benchmark recognition classification detection object action link 2018-04-23 1394
247 PASCAL VOC Parts The PASCAL VOC is augmented with segmentation annotation for semantic parts of objects. For example, for the person category, we provide segmentation mask for 2... detection recognition pascal object part pedestrian human segmentation semantic link 2014-09-30 1570
240 Microsoft COCO The Microsoft COCO (mscoco) is an image recognition and segmentation dataset which contains more than 300k images for more than 70 categories. Other features: Mo... object context segmentation detection recognition benchmark semantic link 2015-05-02 1767
234 UMD Dynamic Scene Recognition The UMD Dynamic Scene Recognition dataset consists of 13 classes and 10 videos per class and is used to classify dynamic scenes. The dataset has been describ... scene recognition classification dynamic video motion link 2017-01-05 1201
233 PASCAL Context We would like to announce the release of PASCAL-Context dataset. We augmented PASCAL VOC 2010 dataset with annotations for 400+ additional categories. In the cu... semantic segmentation pascal benchmark category recognition dense shape link 2014-07-17 1206
230 FGVC-Aircraft Fine-Grained Visual Classification of Aircraft (FGVC-Aircraft) is a benchmark dataset for the fine grained visual categorization of aircraft. Data, annotatio... fine-grained classification recognition benchmark evaluation aircraft airplane link 2018-06-07 2015
227 Omnidirectional and panoramic image dataset We share our omnidirectional and panoramic image dataset (with annotations) to be used for human and car detection. Please reach through: http://cvrg.iyte.edu.... panorama detection car omnidirection human recognition link 2017-01-13 1742
226 Fish4Knowledge The Fish4Knowledge project (groups.inf.ed.ac.uk/f4k/) is pleased to announce the availability of 2 subsets of our tropical coral reef fish video and extracted... classification animal fish video motion nature recognition water camera link 2014-05-15 1282
220 3D Mask Attack Dataset The 3D Mask Attack Database (3DMAD) is a biometric (face) spoofing database. It currently contains 76500 frames of 17 persons, recorded using Kinect for both re... 3d biometry face recognition segmentation frontview emotion link 2016-03-14 1412
219 JPL First-Person Interaction JPL First-Person Interaction dataset (JPL-Interaction dataset) is composed of human activity videos taken from a first-person viewpoint. The dataset particularl... video action recognition interactive motion human link 2014-02-03 879
213 ChairGest Gestures ChairGest is an open challenge / benchmark. The task consists in spotting and recognizing gestures from multiple synchronized sensors: 1 Kinect and 4 Xsens Ine... benchmark recognition kinect gesture detection human link 2014-06-06 907
211 POSTECH Labeled Faces in the Wild POS Labeled Faces in the Wild is a collection of faces proposed for studying face identification in unconstrained environments; its purpose is serving as a... face identification wild recognition registration link 2015-09-10 1471
207 CASIA Gait Recognition Dataset Dataset A (former NLPR Gait Database) was created on Dec. 10, 2001, including 20 persons. Each person has 12 image sequences, 4 sequences for each of the three ... gait recognition biometry action classification motion human foot pressure link 2017-03-10 2704
201 50 Salads The dataset captures 25 people preparing 2 mixed salads each and contains over 4h of annotated accelerometer and RGB-D video data. Annotated activities correspo... action activity recognition classification detection tracking video link 2013-10-05 1124
200 Landmark 3D This dataset provides a collection of web images and 3D models for research on landmark recognition (especially for methods based on 3D models). We hope it coul... landmark recognition classification retrieval 3d reconstruction codebook matching feature flickr link 2016-08-09 1320
192 Our Database of Faces The Our Database of Faces (ORL) dataset contains ten different images of each of 40 distinct subjects. For some subjects, the images were taken at different tim... face recognition illumination human expression link 2013-09-23 1219
188 KTH Multiview Football The KTH Multiview Football dataset contains 771 images of football players, including images taken from 3 views at 257 time instances and 14 annotated body joints. ... multiview pedestrian tracking detection object camera outdoor game soccer pose recognition multitarget link 2018-06-28 1991
182 MSR Action The MSR Action datasets are a collection of various 3D datasets for action recognition. See details http://research.microsoft.com/en-us/um/people/zliu/action... video action recognition detection reconstruction 3d link 2013-09-05 1241
179 CMP Facades The CMP Facade dataset consists of facade images assembled at the Center for Machine Perception, which includes 600 rectified images of facades from various sou... facade rectification urban semantic classification recognition structure similarity segmentation link 2015-06-19 999
174 Pittsburgh Fast-food Image dataset The Pittsburgh Fast-food Image dataset (PFID) consists of 4545 still images, 606 stereo pairs, 303 360° videos for structure from motion, and 27 privacy-preservi... food recognition classification reconstruction video laboratory real link 2018-05-30 2266
171 CHALEARN Multi-modal Gesture Challenge The CHALEARN Multi-modal Gesture Challenge is a dataset of 700+ sequences for gesture recognition using images, kinect depth, segmentation and skeleton data. ht... gesture, kinect, recognition, human, action, illumination, depth, segmentation, skeleton link 2013-08-09 1090
170 Sheffield Kinect Gesture (SKIG) dataset The Sheffield Kinect Gesture (SKIG) dataset contains 2160 hand gesture sequences (1080 RGB sequences and 1080 depth sequences) collected from 6 subjects. ... gesture, kinect, recognition, human, action, illumination, depth link 2017-12-02 1514
153 MSRC Kinect Gesture Dataset The Microsoft Research Cambridge-12 Kinect gesture dataset consists of sequences of human movements, represented as body-part locations, and the associated gest... gesture, kinect, recognition, human, action link 2013-08-08 1194
147 FlickrLogos-32 The FlickrLogos-32 dataset contains photos showing brand logos and is meant for the evaluation of multi-class logo recognition as well as logo retrieval methods... flickr, logo, detection, retrieval, image, object recognition, machine learning, classification brand boundingbox link 2018-03-08 1629
142 German Traffic Sign Recognition Benchmark The German Traffic Sign Recognition Benchmark is a dataset for a multi-class detection problem in natural images, with a cordial invitation to participate. The b... detection, traffic, urban, recognition link 2016-08-15 1807
141 Berkeley Multimodal Human Action Database (MHAD) The Berkeley Multimodal Human Action Database (MHAD) contains 11 actions performed by 7 male and 5 female subjects in the range 23-30 years of age except for on... action classification multiview motion recognition link 2014-02-03 1212
136 3D Object in Clutter Recognition and Segmentation The dataset is composed of 150 synthetic scenes, captured with a (perspective) virtual camera, and each scene contains 3 to 5 objects. The model set is composed... recognition, segmentation, mesh, synthetic link 2013-08-08 1147
126 ISPRS Urban Classification ISPRS Test Project on Urban Classification and 3D Building Reconstruction The ISPRS working group III/4 announces the release of the 2D semantic labeling ben... 3d, reconstruction, building, urban, city, semantic, classification, recognition link 2014-11-24 1023
96 USPS Handwritten Digits Name: Classes Train. Ex. Test. Ex. Features USPS 10 7291 2007 256 8-bit grayscale images of "0" through "9"; handwritten digits; ... text, recognition, classification, handwritten link 2013-03-12 2016
95 Stroke Width Transform Text Stroke Width Transform Text dataset is by Boris Epstein and consists of 307 images and XXX text instances. Detecting Text in Natural Scenes with Stroke Wid... text, detection, recognition, classification link 2015-04-24 1339
94 Chars74K The Chars74K dataset consists of 64 classes (0-9, A-Z, a-z), 7705 characters obtained from natural images, 3410 hand drawn characters using a tablet PC, 62992 s... text, detection, recognition, classification link 2018-08-28 2028
93 Street View Text The Street View Text (SVT) dataset contains 647 words and 3796 letters in 249 images harvested from Google Street View. The dataset is more challenging becaus... text, detection, recognition, classification, outdoor, urban link 2014-01-13 1393
92 ICDAR 2011 This challenge is set up around three tasks: Text Localisation, Text Segmentation and Word Recognition. Participation in any or all tasks is welcome. Check the ... text, detection, recognition, classification link 2016-06-01 1047
91 ICDAR 2003 The ICDAR 2003 datasets available for download on this site: Robust Reading , Robust Word Recognition , Robust OCR , Text Locating and Cursive Script . Pleas... text, detection, recognition, classification link 2018-05-16 1300
24 UIUC Cars This UIUC Cars dataset by Shivani Agarwal, Aatif Awan and Dan Roth contains images of side views of cars for use in evaluating object detection algorithms. The ... car, sideview, detection, scale, recognition, urban link 2013-10-08 1406


total views: 91647