This website provides a list of frequently used computer vision datasets. Wait, there is more!
There is also a description containing common problems, pitfalls and characteristics and now a searchable TAG cloud.
Plus, this is open for crowd editing (if you pass the ultimate Turing test)! - Questions? yacvid [at] hayko [dot] at
Content, design and idea © by Hayko Riemenschneider, 2011-2024.
Texts and images are subject to copyright by their respective authors.
ID | Name | Description | Tags | URL | Date | Views |
---|---|---|---|---|---|---|
535 | CropAndWeed Dataset | The CropAndWeed dataset is focused on the fine-grained identification of 74 relevant crop and weed species with a strong emphasis on data variability. Annotatio... | detection segmentation classification agriculture crop weed benchmark | link | 2023-03-15 | 413 |
534 | UASOL, a large-scale high-resolution outdoor stereo dataset | Dataset for outdoor depth estimation from single and stereo RGB images. The dataset was acquired from the point of view of a pedestrian. Currently, the most nov... | stereo, depth estimation, gps, | link | 2023-02-07 | 232 |
533 | BraTS | BraTS has always been focusing on the evaluation of state-of-the-art methods for the segmentation of brain tumors in multimodal magnetic resonance imaging (MRI)... | brain tumor, segmentation | link | 2022-11-24 | 319 |
532 | GRASP multicam dataset | The GRASP MultiCam data set combines recorded images from a synchronized stereo monochrome camera and IMU with those from a depth sensor. The stereo camera / IM... | stereo synchronized VIO depth sensor | link | 2022-10-28 | 269 |
531 | SEN12MS-CR-TS | SEN12MS-CR-TS is a multi-modal and multi-temporal data set for cloud removal. It contains time-series of paired and co-registered Sentinel-1 and cloudy as well ... | satellite, multimodal, temporal, reconstruction, benchmark | link | 0000-00-00 | 324 |
530 | SEN12MS-CR | SEN12MS-CR is a multi-modal and mono-temporal data set for cloud removal. It contains observations covering 175 globally distributed Regions of Interest recorde... | satellite, multimodal, reconstruction, benchmark | link | 0000-00-00 | 280 |
529 | AIcrowd | Goal: The goal of this benchmark is to train models which can look at images of food items and detect the individual food items present in them. Datase... | food dataset cv | link | 2022-03-01 | 448 |
528 | TrimBot2020 Public Dataset for Garden Navigation | The TrimBot2020 project researched the underlying robotics and vision technologies and prototypes the next generation of intelligent gardening consumer robots. ... | 3D reconstruction outdoor semantic natural garden robot | link | 2022-01-18 | 301 |
527 | University of Houston A Day on Campus (ADOC) | Detecting anomalies in videos is a complex problem with a myriad of applications in video surveillance. However, large and complex datasets that are representat... | Anomaly Detection, Surveillance Camera, Large Scale Surveillance | link | 2021-01-05 | 419 |
526 | Cube++ | Cube++ is a dataset collected for illumination estimation problem. It has 4890 raw 18-megapixel images, each containing a SpyderCube color target in their scene... | color constancy illumination estimation AWB raw | link | 2021-11-12 | 384 |
525 | ArCH dataset | Our proposed dataset, named ArCH (Architectural Cultural Heritage), is composed of 17 annotated point clouds (135 million labelled points), derived from th... | point clouds, cultural heritage, architecture, built heritage, semantic segmentation | link | 2020-12-03 | 406 |
524 | MaskedFace-Net | MaskedFace-Net is a dataset of face images with a correctly or incorrectly worn mask. Data paper: https://doi.org/10.13140/RG.2.2.20336.28165 ... | virus mask wearing behaviors face analysis covid public health education epidemiology | link | 2021-06-17 | 460 |
523 | Real-World Textured Things | Real World Textured Things (RWTT) is a collection of publicly available textured 3D models, generated with modern off-the-shelf photo-reconstruction tools. The ... | 3d-model texture textured-model parametrization reconstruction | link | 2020-10-19 | 420 |
522 | SUN Colonoscopy Video Database | SUN (Showa University and Nagoya University) Colonoscopy Video Database is designed for the evaluation of an automated colorectal-polyp detection. The database ... | polyp detection medical pathological label computer-aided-diagnosis | link | 2020-10-15 | 677 |
521 | Replica dataset | The Replica Dataset is a dataset of high quality reconstructions of a variety of indoor spaces. Each reconstruction has clean dense geometry, high resolution an... | 3d reconstruction indoor building semantic segmentation dense texture room | link | 2020-10-12 | 407 |
520 | SUN360 Panorama Database | The goal of the SUN360 panorama database is to provide academic researchers in computer vision, computer graphics and computational photography, cognition and n... | omnidirectional panorama 360 layout objects scene understanding | link | 2021-01-19 | 398 |
519 | SMALR | Animals are widespread in nature and the analysis of their shape and motion is important in many fields and industries. Modeling 3D animal shape, however, is di... | shape 3d mesh model quadruped animal pose bear lion cheetah cougar horse pig rhino tiger texture | link | 2020-09-21 | 1197 |
518 | CSAIL Shape Deformation Animals | The dataset about Shape Deformation of Animals consists of deformed meshes for horse, camel, lion, elephant, flamingo, and face models. Every mesh is triangulated... | shape 3d mesh model quadruped animal deformation transfer horse camel lion elephant flamingo face | link | 2020-09-21 | 1089 |
517 | EPIC-Kitchens-100 | We will release EPIC-Kitchens-100, the extension of EPIC-Kitchens now including 100 hours, 700 videos, 90k action segments, 20k unique narrations, 45 kitchens, ... | action egocentric video benchmark kitchen cooking food activity daily worldwide | link | 2020-09-21 | 321 |
516 | 3D Ken Burns Dataset | A large collection of renderings from synthetic scenes with ground truth depth and normal maps. Each sample in the dataset has four views that form a stereo vo... | multiview stereo depth normals | link | 2020-08-22 | 420 |
515 | GPA:geometric pose affordance dataset | Dataset of real 3D people interacting with real 3D scenes. 300k static RGB frames of 13 subjects in 8 scenes with ground-truth scene meshes, and motion capture s... | 3d human pose estimation, human scene interaction, affordance, scene understanding | link | 2020-08-22 | 449 |
514 | University of Houston Camera Tampering Detection Dataset | An unauthorized or an accidental change in the view of a surveillance camera is called a tampering. UHCTD is a large scale synthetic dataset for camera tamperin... | Anomaly Detection, Surveillance Camera, Video Tampering Detection, Large Scale Surveillance | link | 2020-04-07 | 638 |
513 | P-DESTRE: Fully Annotated Datasets for Pedestrian Detection, Tracking, Re-Identification and Search | A large number of applications using unmanned aerial vehicle (UAV) sensors and platforms is being developed, for agriculture, logistics, recreational and milita... | Pedestrian Detection, Tracking, Re-identification, Search | link | 2020-04-07 | 673 |
512 | Chess Pieces | Bounding box labeled chess pieces taken from the side of the board at a 45 degree angle. Available for download in multiple annotation formats. Licensed as publ... | object detection, detection, chess, mobile, classification, recognition | link | 2020-02-26 | 597 |
511 | Mut1ny Face/head segmentation dataset | The face/head segmentation dataset contains over 21000 facial images with pixel-wise segmentation annotation of eyes, nose, ears, mouth, teeth, hair, eyebrows, beard. It c... | face, segmentation, human | link | 2020-04-01 | 685 |
510 | SEN12MS | SEN12MS is a dataset consisting of 180,662 corresponding image triplets containing Sentinel-1 dual-pol SAR data, Sentinel-2 multi-spectral imagery, and MODIS-de... | | link | 2019-10-30 | 511 |
509 | Lost And Found dataset | The LostAndFound Dataset addresses the problem of detecting unexpected small obstacles on the road often caused by lost cargo. The dataset comprises 112 stereo... | segmentation 2d image groundtruth small object uncertainty | link | 2020-10-30 | 619 |
508 | 3D60 Dataset | 3D60 is a collective dataset generated in the context of various 360 vision research works [1], [2], [3]. It comprises multi-modal stereo renders of scenes from... | 360 Omnidirectional Panorama Depth Normal Surface Scene | link | 2020-09-21 | 567 |
507 | 38-Cloud | Introduced for pixel-level cloud segmentation in satellite images. There are 8400 patches for training and 9201 patches for testing extracted from 38 Landsat 8 ... | Cloud, Satellite Images, Segmentation, Landsat 8, Remote Sensing, | link | 2019-10-15 | 563 |
506 | Cartoon Set 10K and 100K | Cartoon Set is a collection of random, 2D cartoon avatar images. The cartoons vary in 10 artwork categories, 4 color categories, and 4 proportion categories, wi... | cartoon avatar cross-domain image translation drawing gan google | link | 2019-09-27 | 624 |
505 | BigEarthNet | The BigEarthNet is a new large-scale Sentinel-2 benchmark archive, consisting of 590,326 Sentinel-2 image patches. The image patch size on the ground is 1.2 x 1... | aerial satellite earth semantic segmentation land cover europe multi-channel | link | 2019-09-17 | 547 |
504 | DublinCity | Urban Modelling Group at University College Dublin (UCD) captured major area of Dublin city centre (i.e. around 5.6 km^2 including partially covered areas) was ... | LIDAR | link | 2019-07-30 | 576 |
503 | Mid-Air [A multi-modal dataset for extremely low altitude drone flights] | Mid-Air, The Montefiore Institute Dataset of Aerial Images and Records, is a multi-purpose synthetic dataset for low altitude drone flights. It provides a large... | synthetic video drone UAV aerial deep learning depth semantic segmentation stereo normals groundtruth multi-sensor odometry SLAM localization | link | 2019-07-20 | 896 |
502 | Twitter100k | Twitter100k is a real-world dataset for weakly supervised cross-media retrieval by Yuting Hu, Liang Zheng, Yi Yang, and Yongfeng Huang. This contribute... | twitter text image recognition sentence pair multimodal weakly supervision | link | 2019-06-19 | 776 |
501 | ICG Semantic Drone Dataset | The Semantic Drone Dataset focuses on semantic understanding of urban scenes for increasing the safety of autonomous drone flight and landing procedures. The im... | drone uav image tracking semantic annotation segmentation bounding box bbox groundtruth thermal temperature rgb 3d | link | 2020-04-11 | 1086 |
500 | senseFly drone datasets | Explore how senseFly drone solutions are employed around the globe - from topographic mapping and site surveys to stockpile monitoring, crop scouting, earthwork... | drone uav image field nature city urban park estate dam quarry village glacier lighthouse building | link | 2019-06-17 | 871 |
499 | MultiDrone Public DataSet | The public MultiDrone Dataset has been assembled using both pre-existing audiovisual material and newly filmed UAV shots. A large subset of these data has been ... | drone uav image tracking sport football rowing bicycle human crowd boat race | link | 2019-06-17 | 726 |
498 | highD - The Highway Drone Dataset | The Highway Drone Dataset consists of naturalistic trajectories of 110500 Vehicles Recorded at German Highways. The highD dataset is a new dataset of natural... | drone uav highway image vehicle tracking germany traffic pattern | link | 2020-05-26 | 1045 |
497 | Boxy | The Boxy vehicle detection dataset consists of 200,000 images with 1.99 million annotated vehicles.... | car vehicle detection | link | 2020-02-19 | 685 |
496 | Objects365 | Object detection is of significant value to the Computer Vision and Pattern Recognition communities as it is one of the fundamental vision problems. Therefore, ... | detection object category large-scale human benchmark | link | 2020-04-01 | 928 |
495 | Tampere University indoor dataset | Tampere University Indoor Dataset: The TUT indoor dataset is a fully-labeled image dataset to facilitate the broad use of image recognition and object detecti... | Deep learning, object detection, indoor dataset | link | 2019-11-28 | 912 |
494 | SYNTHIA | The SYNTHetic collection of Imagery and Annotations, is a dataset that has been generated with the purpose of aiding semantic segmentation and related scene un... | synthetic video object lighting weather condition depth season scene benchmark segmentation semantic detection | link | 2019-03-25 | 694 |
493 | Smartphone Image Denoising Dataset (SIDD) | Using this procedure, we have captured a dataset, the Smartphone Image Denoising Dataset (SIDD), of ~30,000 noisy images from 10 scenes under different lighting... | denoising noise real photograph image synthetic quality benchmark groundtruth | link | 2020-05-03 | 1365 |
492 | The Darmstadt Noise Dataset | Lacking realistic ground truth data, image denoising techniques are traditionally evaluated on images corrupted by synthesized i. i. d. Gaussian noise. This is... | denoising noise real photograph image synthetic quality benchmark | link | 2023-10-18 | 882 |
491 | Speech-driven 3D Facial Motion Database (S3DFM) | ANN: dynamic 2D/3D speaking face dataset with synchronized audio We would like to announce a new facial biometric dataset that has: * 1 second of 500 frame ... | speech motion face 3d recognition speaker | link | 2019-03-09 | 721 |
490 | ApolloScape | Advanced Open Tools and Datasets for Autonomous Driving ApolloScape , part of the Apollo project for autonomous driving, is a research-oriented project to ... | video segmentation pixel-level instance-level depth | link | 2019-03-08 | 826 |
489 | CADP: A Novel Dataset for CCTV Traffic Camera based Accident Analysis | The Car Accident Detection and Prediction (CADP) dataset consists of 1,416 video segments collected from YouTube, of which 205 video segments have full spatio-temporal ... | Car Accident Detection, Accident Forecasting, CCTV analysis, Camera based accident analysis | link | 2020-05-07 | 1805 |
488 | HistAerial | HistAerial is a Large Scale Dataset for Historical Aerial Images Classification and Land Use recognition.... | historical aerial images land use land cover texture classification remote sensing reconstruction | link | 2019-02-15 | 838 |
487 | Bark-101 | Bark-101 is a challenging dataset made of 101 bark classes. It has been designed on top of the LifeCLEF challenge to evaluate both texture and bark recognition ... | bark texture classification tree recognition stem trunk | link | 2019-02-15 | 710 |
486 | CEC Vehicle Make and Model Recognition Dataset (VMMRdb) | Overview Despite the ongoing research and practical interests, car make and model analysis only attracts few attentions in the computer vision communi... | car vehicle automotive | link | 2021-01-21 | 1123 |
485 | Stanford Cars Dataset | Overview The Cars dataset contains 16,185 images of 196 classes of cars. The data is split into 8,144 training images and 8,041 testing images, where ea... | car | link | 2019-02-03 | 1280 |
484 | Flickr30k Entities | We propose to use the visual denotations of linguistic expressions (i.e. the set of images they describe) to define novel denotational similarity metrics, which... | phrase grounding caption text analysis image description flickr association video link | link | 2019-01-23 | 664 |
483 | CUHK DeepFashion2 | DeepFashion2 is a comprehensive fashion dataset. It contains 491K diverse images of 13 popular clothing categories from both commercial shopping stores and cons... | fashion apparel attributes recognition localization human benchmark polygon annotation instance semantic segmentation | link | 2019-08-18 | 1639 |
482 | CUHK DeepFashion | We contribute DeepFashion database, a large-scale clothes database, which has several appealing properties: First, DeepFashion contains over 800,000 diverse ... | fashion apparel attributes recognition localization human benchmark polygon annotation instance semantic segmentation | link | 2020-08-22 | 1254 |
481 | ModaNet and PaperDoll | ModaNet is a street fashion images dataset consisting of annotations related to RGB images. ModaNet provides multiple polygon annotations for each image. This d... | fashion apparel attributes recognition localization human benchmark polygon annotation instance semantic segmentation | link | 2019-01-09 | 1076 |
480 | FashionAI Apparel attributes recognition | A systematic and practical apparel attributes recognition dataset. We have constructed a hierarchical clothing label system via a professional arrangement of th... | fashion apparel attributes woman female recognition localization human retrieval | link | 2019-05-24 | 1115 |
479 | FashionAI Apparel keypoints | A professional dataset for apparel keypoints localization in practical scenarios. The definition of keypoints originates from apparel design principles. Current... | fashion apparel keypoint woman female recognition localization human | link | 2019-01-09 | 641 |
478 | UE4Sim and Sim4CV | Sim4CV is the general environment for simulating data for computer vision tasks, like object tracking, pose estimation, detection, action recognition, indoor sc... | synthetic object tracking, pose estimation, detection, action recognition, indoor scene understanding, multi-agent collaboration, autonomous navigation, 3d reconstruction, crowd understanding, urban scene understanding, human tracking, aerial surveying simulation environment 3d photo-realistic realism depth segmentation urban rgb render | link | 2020-08-22 | 1641 |
477 | House3D: A Rich and Realistic 3D Environment | House3D is a virtual 3D environment which consists of thousands of indoor scenes equipped with a diverse set of scene types, layouts and objects sourced from th... | house indoor simulation environment 3d photo-realistic realism depth segmentation urban rgb render | link | 2018-11-30 | 1026 |
476 | The Webcam Clip Art Dataset | This is a subset of the dataset introduced in the SIGGRAPH Asia 2009 paper, Webcam Clip Art: Appearance and Illuminant Transfer from Time-lapse Sequences. As... | webcam year illumination appearance time-lapse city urban weather | link | 2018-11-20 | 773 |
475 | MAE Dataset | The Multimodal Attribute Extraction (MAE) dataset is the first benchmark dataset for the task of multimodal attribute extraction. It is composed of mixed media ... | multimedia multimodal images text attribute recognition pair product search asset retrieval | link | 2018-11-20 | 637 |
474 | Open MIC | Open MIC (Open Museum Identification Challenge) contains photos of exhibits captured in 10 distinct exhibition spaces of several museums which showcase painting... | museum recognition identification benchmark exhibit image paintings, time, sculpture, glassware, ceramics, indigenous | link | 2019-01-09 | 694 |
473 | 2015 Dublin LiDAR | 2015 Aerial Laser and Photogrammetry Survey of Dublin City Collection Record This record serves as an index to a suite of high density, aerial remote sensing... | laser scan aerial flight urban city dublin pointcloud 3d lidar | link | 2018-10-10 | 711 |
472 | human3.6m | human3.6m dataset is one of the largest datasets for 3D human pose estimation. It consists of 3.6 million images featuring 11 actors performing 15 daily activ... | human pose estimation camera video 3d laser scan action actor body part mocap | link | 2019-11-18 | 1893 |
471 | CrowdFlower | /! Commercial annotation platform, not a publicly released dataset Our Human-in-the-Loop Machine Learning platform transforms unstructured text, image, audio, ... | dataset benchmark annotation | link | 2018-09-11 | 774 |
470 | MVOR | MVOR is a Multi-view Multi-person RGB-D Operating Room Dataset for 2D and 3D Human Pose Estimation We are pleased to announce the release of the MVOR datase... | medical clinical human annotation multiview pose estimation rgbd operation hospital | link | 2018-10-08 | 842 |
469 | ALASKA | ALASKA is the second contest on steganalysis, after a fruitful first contest, called BOSS and organized in 2010, which gave birth to the development of large f... | steganalysis steganography image recognition challenge forensics | link | 2018-09-07 | 765 |
468 | NewBarkTex | The BarkTex database includes six tree bark classes, with 68 images per class. To build the New BarkTex set, a region of interest, centered on the bark and whos... | Bark, texture, computer vision, classification | link | 2018-09-07 | 720 |
467 | BarkTex | This image database contains a collection of 408 color textures for the computer vision community. The pictures show the bark of six different European trees.... | Bark, texture, computer vision, classification | link | 2018-09-07 | 823 |
466 | Trunk12 | Since there is no known publicly available tree bark image data set, a new publicly available data set was created as a part of Bsc thesis. It contains about 36... | Bark, texture, computer vision, classification | link | 2018-09-07 | 685 |
465 | IMDb-Face | IMDb-Face is a new large-scale noise-controlled dataset for face recognition research. The dataset contains about 1.7 million faces, 59k identities, which is ma... | Face recognition | link | 2018-09-06 | 1053 |
464 | vegetation synthetic real | Data from: Intercomparison of photogrammetry software for three-dimensional vegetation modelling Probst A, Gatziolis D, Strigul N Date Published: June 12,... | vegetation synthetic real 3d model reconstruction software benchmark plant | link | 2018-08-09 | 704 |
463 | Families In The Wild (FIW) Database. | Families In The Wild (FIW) Database is the largest and most comprehensive database available for kinship recognition. FIW is made up of 11,932 natural family p... | recognition kinship family relationship dna similarity face | link | 2020-03-20 | 965 |
462 | Taskonomy | The Taskonomy dataset consists of 3.9 Mil. Scenes, 600 Buildings, 25 Tags per Image, 1024 Resolution for taxonomy and transfer learning tasks. We provide a larg... | transfer learning taxonomy task deep indoor 3d mesh pose camera high-resolution | link | 2018-08-08 | 792 |
461 | wilddash robust | The WildDash Benchmark provides a dataset and benchmark for semantic and instance segmentation. We aim to improve the expressiveness of performance evaluation f... | robust segmentation noise environment lighting fog semantic autonomous | link | 2018-06-26 | 885 |
460 | Exclusively-Dark-Image-Dataset | In order to facilitate a new object detection and image enhancement research, we introduce the Exclusively Dark (ExDark) dataset (CVIU - accepted). The Exclusiv... | object detection, low-light dark image enhancement | link | 2019-10-16 | 1633 |
459 | MVSEC | The Multi Vehicle Stereo Event Camera dataset is a collection of data designed for the development of novel 3D perception algorithms for event based cameras. St... | event camera speed intensity dynamic gps imu 3d benchmark | link | 2018-05-30 | 920 |
458 | WIDER Face and Pedestrian Challenge | Three datasets, WIDER Face, WIDER Pedestrian, and WIDER Person Search are released for this challenge in conjunction with ECCV 2018. Check the website for mo... | Face detection, pedestrian detection, person search | link | 2019-03-23 | 2192 |
457 | Topcoder FMOW - Functional Map of the World Challenge | Intelligence analysts, policy makers, and first responders around the world rely on geospatial land use data to inform crucial decisions about global defense an... | aerial segmentation semantic topcoder rgb satellite urban building | link | 2018-04-30 | 1072 |
456 | Kaggle dstl satellite | The proliferation of satellite imagery has given us a radically improved understanding of our planet. It has enabled us to better achieve everything from mobili... | aerial segmentation semantic kaggle multispectral rgb satellite panchromatic urban building | link | 2018-04-30 | 1352 |
455 | Darmstadt Noise Dataset | The Darmstadt Noise dataset provides a benchmark for denoising performance. Lacking realistic ground truth data, image denoising techniques are traditionally e... | noise denoising benchmark high-resolution groundtruth iso natural real | link | 2019-03-18 | 999 |
454 | SBM-RGBD Dataset | The SBM-RGBD dataset [provides] all facilities (data, ground truths, and evaluation scripts) in order to evaluate and compare scene background modelling metho... | background modeling rgbd kinect video color depth benchmark indoor surveillance | link | 2019-08-12 | 1294 |
453 | San Diego State University - Open Turbulent Image Set (OTIS) | Despite the existence of several turbulence mitigation algorithms in the literature, no common dataset exists to objectively evaluate their efficiency. This dat... | Image Sequence Atmospheric Turbulence Restoration Evaluation | link | 2018-04-16 | 758 |
452 | INRIA Praxis Gesture | PRAXIS GESTURE DATASET is a new challenging RGB-D upper-body gesture dataset recorded by Kinect v2. The dataset is unique in the sense that it addresses the Pra... | gesture rgbd body activity action kinect recognition taxonomy | link | 2018-04-16 | 794 |
449 | HOWTO Create Dataset | Our catalog of challenges for CV algorithms creates a basis for referencing criticalities by name and allows the calculation of criticality coverage. It is a si... | benchmark dataset howto | link | 2018-04-11 | 721 |
448 | EPIC-KITCHENS | EPIC-KITCHENS, is the largest egocentric video benchmark recorded by 32 participants in their native kitchen environments. Our videos depict non-scripted daily ... | action egocentric video benchmark kitchen cooking food activity daily worldwide | link | 2018-04-11 | 1097 |
447 | WIKI List | A list of machine learning datasets ... | benchmark dataset wiki aerial machine learning | link | 2018-04-19 | 837 |
446 | DAVIS: Densely Annotated VIdeo Segmentation 2016 | Dataset released with the CVPR 2016 paper. The videos contain several types of objects and humans with a high quality segmentation annotation. In each video seq... | object tracking segmentation video benchmark code hd quality resolution | link | 2018-04-05 | 1089 |
445 | Grouping Face in the Wild (GFW) Dataset | This is the largest real-world face clustering dataset. We used this dataset in our AAAI 2018 paper "Merge or Not? Learning to Group Faces via Imitation Learnin... | Face clustering | link | 2018-11-12 | 928 |
444 | Supervisely Person Dataset | The Supervisely Person Dataset consists of 5711 images with 6884 high-quality annotated person instances. All steps below are done inside Supervisely without a... | person pedestrian segmentation semantic mask supervisely annotation automatic dataset instance | link | 2022-04-19 | 4248 |
443 | ApolloScape Semantic Segmentation | The ApolloScape Parsing dataset is provided by Baidu for the CVPR 2018 Workshop on Autonomous Driving Challenge. It is expected that the Scene Parsing dataset ... | segmentation semantic scene benchmark size urban autonomous driving camera calibration video | link | 2019-03-08 | 1577 |
442 | YouTube Co-localization Dataset (ECCV + IEEE Trans. CSVT papers) [GEU and NTU] | The dataset consists of bounding box annotations for 15k frames of videos collected from YouTube Objects Dataset. If you find this dataset useful, kindly ci... | Co-localization Co-segmentation Co-saliency Video CATS Tracklet Benchmark Binary Object Retrieval Segmentation Semantic Similarity Tracking Matching Localization | link | 2018-03-21 | 1264 |
441 | Alzheimers Disease Neuroimaging Initiative (ADNI) | The Alzheimers Disease Neuroimaging Initiative (ADNI) data are shared without embargo through the LONI Image & Data Archive (IDA), a secure research data repos... | brain human medicine medical scan behavior disease | link | 2018-03-16 | 836 |
440 | Human Connectome Project (HCP) | The Human Connectome Project (HCP) has tackled one of the great scientific challenges of the 21st century: mapping the human brain, aiming to connect its struct... | brain human medicine medical scan lifespan adult age behavior | link | 2020-03-28 | 844 |
439 | Cornell Activity Datasets: CAD 60 & CAD 120 | The CAD-60 and CAD-120 data sets comprise RGB-D video sequences of humans performing activities, recorded using the Microsoft Kinect sensor. CAD... | activity action affordance rgbd video daily human kinect | link | 2018-03-15 | 820 |
438 | CAD 120 affordance | This is the CAD 120 Affordance Segmentation Dataset based on the Cornell Activity Dataset CAD 120 (see http://pr.cs.cornell.edu/humanactivities/data.php). Co... | segmentation affordance action cad attribute human | link | 2018-03-15 | 819 |
437 | Alpert Object Instance Segmentation | To evaluate the segmentation produced by different algorithms we have compiled a database, currently containing 200 gray level images along with ground truth se... | object segmentation instance tool grayscale | n/a | 2018-11-20 | 750 |
436 | Aberystwyth Leaf Evaluation | We are releasing the Aberystwyth Leaf Evaluation dataset acquired to support the work of the EPSRC funded project Dynamic Modelling of Plant Growth with Comput... | segmentation leaf biology groundtruth time-lapse nature growth | link | 2019-01-09 | 772 |
435 | Shadow Detection/Texture Segmentation | The Shadow Detection/Texture Segmentation Computer Vision Dataset is focused around texture analysis, so each image sequence contains shadows moving in front o... | shadow segmentation texture detection artificial 3d virtual noise illumination | link | 2018-11-20 | 943 |
434 | Online RGBD Action Dataset (ORGBD) | The Online RGBD Action dataset targets human action (human-object interaction) recognition based on RGBD video data. There are seven categories of human act... | action rgbd online human recognition daily | link | 2019-04-04 | 1010 |
433 | 20bn-Something-Something | The 20BN-SOMETHING-SOMETHING dataset is a large collection of densely-labeled video clips that show humans performing pre-defined basic actions with everyday ob... | action recognition human video daily | link | 2018-03-15 | 1015 |
432 | Collaborative 3D reconstruction with smartphones | collaborative 3d reconstruction with smartphones dataset: Six off-the-shelf Android smartphones captured video streams (Table 1, see below) of three cultural h... | collaborative 3d reconstruction smartphone image cloud video | link | 2018-03-15 | 734 |
431 | TextileTube dataset - University of Leon | Textile retrieval in real environments is a poorly investigated research field apart from fashion clothes retrieval. To our knowledge, there is no publicly avail... | textile, retrieval, CBIR, MSER | link | 2018-03-12 | 739 |
430 | WildDash Benchmark | This website provides a dataset and benchmark for semantic and instance segmentation. We aim to improve the expressiveness of performance evaluation for compute... | semantic instance segmentation Cityscapes weather | link | 2018-03-09 | 909 |
429 | SYDNEY URBAN OBJECTS DATASET | This dataset contains a variety of common urban road objects scanned with a Velodyne HDL-64E LIDAR, collected in the CBD of Sydney, Australia. There are 631 ind... | URBAN Lidar road vehicle pedestrian sign tree city australia velodyne | link | 2018-11-20 | 1325 |
428 | BIM Schependomlaan | Dataset Schependomlaan All data owners have given permission to use the data for scientific and academic purposes. The data is gathered during the master the... | bim aerial urban reconstruction building project drone netherlands | link | 2018-02-20 | 1211 |
427 | CITY-OSM - ETH Zurich | Learning Aerial Image Segmentation From Online Maps: This is the ground truth data generated for the publication Learning Aerial Image Segmentation F... | semantic computer vision aerial image segmentation map geoscience remote sensing deep learning berlin chicago paris potsdam tokyo zurich | link | 2018-01-25 | 1175 |
426 | UCL Motion Model Selection Dataset | The UCL Motion Model Selection Dataset contains videos in avi format, compressed with HuffYUV. They are separated into folders according to manual inspection-ba... | motion model real-world youtube video | link | 2018-01-10 | 837 |
425 | Architectural Style Facade | The Architectural Style Facade dataset is a 25-class dataset used for architectural style classification and regression. (The download link has been updated ... | classification architecture style facade urban regression | link | 2018-01-10 | 1958 |
424 | Automatic Image Cropping | The Automatic Image Cropping dataset contains ill-composed images with manual crops provided by qualified experts. As described in Section 2.1, our visual co... | image crop automatic aesthetics multimedia | link | 2019-09-16 | 945 |
423 | Makeup Induced Face Spoofing (MIFS) | The Makeup Induced Face Spoofing (MIFS) dataset consists of 107 makeup transformations taken from random YouTube makeup video tutorials. Each subject is attempt... | face recognition makeup illusion fake accuracy | link | 2019-12-17 | 1125 |
422 | Air-Ground-KITTI HD Maps | The Air-Ground-KITTI dataset consists of annotated aerial and ground images used in the experiments and is provided at downloads. Examples of the dataset. Left: aeria... | segmentation benchmark aerial urban satellite street road kitti hd | link | 2020-05-25 | 2441 |
421 | TUD Dynamic scenes dataset | The dynamic scenes dataset contains image sequences consisting of overall 1936 images. The images are taken from a camera inside a driving car and mainly show r... | semantic segmentation dynamic urban road street object driving crf | link | 2017-12-12 | 1233 |
420 | ATLAS Dione Robot-Assisted Surgery Video Understanding Dataset by Roswell Park Cancer Institute / Th | ATLAS Dione dataset provides video data (86 full subject study videos (~910 action clips)) of ten surgeons from Roswell Park Cancer Institute (RPCI) (Buffalo, N... | robotic surgery tool detection object detection action recognition expertise VOC video gesture | link | 2019-09-23 | 1143 |
419 | UC Merced Land Use Dataset | The UC Merced dataset is a 21 class land use image dataset meant for research purposes. There are 100 images for each of the following classes: agricultural ... | semantic segmentation classification aerial land building urban | link | 2017-11-28 | 1806 |
418 | Udacity Annotated Driving Datasets | Udacity Annotated Driving Datasets have two datasets: Dataset 1 The dataset includes driving in Mountain View California and neighboring cities during dayli... | classification segmentation urban street selfdriving autonomous udacity annotation california city daylight | link | 2020-05-07 | 1734 |
417 | Visual Lip Reading Feasibility (VLRF) | The VLRF database is designed with the aim to contribute to research in visual only speech recognition. A key difference of the VLRF database with respect to ex... | lip reading recognition speaker spanish language mouth face speech | link | 2017-11-07 | 956 |
416 | CO-SKEL dataset (CVPR 17) [Graphic Era University and Nanyang Technological University] | This dataset has been developed to facilitate evaluation of Object Co-skeletonization problem introduced by the following paper: Object Co-skeletonization w... | skeleton, skeletonization, segmentation, co-segmentation, co-skeletonization, segmentation, saliency, pruning | link | 2018-03-20 | 1162 |
415 | Total Text Dataset | In order to facilitate a new text detection research, we introduce the Total-Text dataset, which is more comprehensive than the existing text datasets. The Tota... | text detection, text recognition, scene text detection | link | 2020-06-11 | 2772 |
414 | FashionGAN Dataset | New annotations (languages and segmentation maps) on the subset of the DeepFashion dataset. The data is used in the paper Be Your Own Prada: Fashion Synthes... | GAN, Fashion, Segmentation, Language | link | 2020-01-05 | 1372 |
413 | DPED: DSLR Photo Enhancement Dataset | We introduce a large-scale DPED dataset that consists of photos taken synchronously in the wild by three smartphones and one DSLR camera. The devices used to co... | dped image photo enhancement deep learning computer vision | link | 2017-10-24 | 1018 |
412 | MegaAge Dataset | We introduce a new large-scale MegaAge dataset that consists of 41,941 faces annotated with age posterior distributions. We also provide the MegaAge-Asian datas... | Face Analysis, Age Estimation | link | 2020-02-18 | 2700 |
411 | ISR-UoL 3D Social Activity Dataset | This is a social interaction dataset between two subjects. This dataset consists of RGB and depth images and tracked skeleton data (i.e. joints 3D coordinates a... | Social, Activity, Interaction, Human, Indoor, Skeleton, RGBD, ROS action | link | 2017-11-28 | 1031 |
410 | Charades Activity Dataset | 10,000 30sec videos from 267 volunteers, each annotated with multiple activities, captions, objects, and temporal localizations. From "Hollywood in Homes: Cr... | video activity recognition action object caption localization detection human daily | link | 2019-04-10 | 1034 |
409 | NII Okutama-Action: An Aerial View Video Dataset for Concurrent Human Action Detection | We present Okutama-Action, a new video dataset for aerial view concurrent human action detection. It consists of 43 minute-long fully-annotated sequences with 1... | action detection aerial view uav drone pedestrian multi-human tracking | link | 2017-09-20 | 2032 |
408 | PETS 2016 IPATCH dataset | The PETS 2016 IPATCH dataset contains a set of fourteen multi camera recordings (visible, thermal) collected off the coast of Brest, France, in collaboration wit... | maritime vessel boat detection tracking thermal visible gps radar multimodal | link | 2017-09-16 | 1463 |
407 | Inria Aerial Image Labeling | The Inria Aerial Image Labeling addresses a core topic in remote sensing: the automatic pixelwise labeling of aerial imagery (link to paper). Dataset feature... | semantic segmentation aerial urban city groundtruth building footprint house | link | 2018-03-22 | 1535 |
406 | Swedish Traffic Sign Recognition | The Swedish Traffic Sign Recognition provides Matlab code for parsing the annotation files and displaying the results. Part0 for each set contains the annotated... | traffic sign recognition detection urban city | link | 2020-02-28 | 1434 |
405 | SydneyHouse HouseCraft | In HouseCraft, we utilize rental ads to create realistic textured 3D models of building exteriors. In particular, we exploit the address of the property and its... | house urban building city floorplan street semantic segmentation localization registration google sydney | link | 2018-01-28 | 1164 |
404 | Zurich Summer Dataset | The Zurich Summer v1.0 dataset is a collection of 20 chips (crops), taken from a QuickBird acquisition of the city of Zurich (Switzerland) in August 2002. Quick... | satellite segmentation semantic aerial urban city zurich pan nir rgb gsd superpixel annotation | link | 2017-09-12 | 1442 |
403 | Multispectral Imaging (MSI) | Multispectral Imaging (MSI) datasets were acquired using IRIS II which is a lightweight portable system comprising of a high resolution camera, a novel filter w... | multi-spectral illumination wavelength groundtruth registration matching alignment | link | 2022-01-05 | 1218 |
402 | GeoFaces | A large dataset of geotagged face images collected from Flickr. The zip file contains text files containing urls of the images. Face2GPS: Estimating Geograph... | face localization geotag classification gender age human | link | 2019-01-09 | 976 |
401 | Berkeley DeepDrive Video | The Berkeley DeepDrive Video Dataset contains 2x order of magnitude more video training data. Explore 100,000 HD video sequences of over 1,100-hour driving... | urban autonomous driving deep learning endtoend | link | 2018-06-26 | 1278 |
400 | Visual Discriminative Question Generation (VDQG) dataset | The dataset contains 11202 ambiguous image pairs collected from Visual Genome. Each image pair is annotated with 4.6 discriminative questions and 5.9 non-discri... | vision language VQA question genome biology | link | 2017-09-12 | 1263 |
399 | Osnabrück - Synthetic Scalable Cube Dataset | Voxel Based Dataset for Systematic 3D reconstruction by artificial neural networks (ANNs). A synthetic scalable cube dataset for training, testing and valida... | 3D, Deep Learning, Reconstruction, SfM, Synthetic city urban | link | 2018-02-13 | 1098 |
398 | Osnabrück - Gaze Tracking Data Set | Gaze data on video stimuli for computer vision and visual analytics. Converted 318 video sequences from several different gaze tracking data sets with polygo... | segmentation, gaze data, polygon annotation, video, metadata | link | 2018-02-13 | 1065 |
397 | MPI-I VISPR (Visual Privacy) | We present a dataset to address the problem of visual privacy - where users unintentionally leak private information when sharing personal images online, such a... | privacy multilabel classification flickr scene regression | link | 2018-04-13 | 1075 |
396 | ADE20k | Scene Parsing Benchmark Scene parsing data and part segmentation data derived from ADE20K dataset can be downloaded from MIT Scene Parsing Benchmark. Images ... | segmentation semantic annotation benchmark scene recognition | link | 2017-08-03 | 1126 |
395 | AWS Public Datasets | AWS hosts a variety of public datasets that anyone can access for free. Previously, large datasets such as satellite imagery or genomic data have required hour... | amazon aerial classification deep learning segmentation recognition satellite human biology space image resolution | link | 2018-10-26 | 1700 |
394 | Matterport 2D-3D-Semantics Data | The 2D-3D-S dataset provides a variety of mutually registered modalities from 2D, 2.5D and 3D domains, with instance-level semantic and geometric annotations. I... | 3d panorama semantic segmentation depth normal indoor building reconstruction large-scale | link | 2017-07-27 | 1539 |
393 | ZuBuD+ | ZuBuD+, created in February 2017 by Federico Magliani (University of Parma), introduces many query images balancing the class evaluated from the previous datase... | landmark, building, image retrieval, urban | link | 2017-07-17 | 876 |
392 | M2CAI 2016 Challenge | These datasets were generated for the M2CAI challenges, a satellite event of MICCAI 2016 in Athens. Two datasets are available for two different challenges: m2c... | medicine video recognition surgery workflow challenge | link | 2017-07-11 | 1424 |
391 | xawAR16 | The xawAR16 dataset is a multi-RGBD camera dataset, generated inside an operating room (IHU Strasbourg), which was designed to evaluate tracking/relocalization ... | medicine video recognition surgery table operation depth | link | 2017-07-11 | 800 |
390 | Cholec80 | The Cholec80 dataset contains 80 videos of cholecystectomy surgeries performed by 13 surgeons. The videos are captured at 25 fps. The dataset is labeled with th... | medicine video recognition surgery phase tool | link | 2020-06-18 | 1136 |
389 | action recognition benchmark | We wanted to have a collection of action recognition papers and results that everybody can use for reference. The site will work by the community principle, so ... | action recognition benchmark dataset | link | 2017-07-11 | 831 |
388 | Open Images Dataset v4 new | Today, we introduce Open Images, a dataset consisting of ~9 million URLs to images that have been annotated with labels spanning over 6000 categories. We tried ... | classification large-scale category real image deep annotation automatic benchmark boundingbox | link | 2018-09-11 | 1233 |
387 | Edinburgh Ceilidh Overhead Video Data | This web page contains video data and ground truth for 16 dances with two different dance patterns. The style of dancing is inspired by Scottish Ceilidh dancing... | video dance chemistry pattern background motion analysis action | link | 2017-07-02 | 824 |
386 | Utrecht University, ShakeFive2 | ShakeFive2 A collection of 8 dyadic human interactions with accompanying skeleton metadata. The metadata is frame based xml data containing the skeleton join... | human interaction Kinect video | link | 2017-06-26 | 719 |
385 | WildLife Documentary (WLD) Dataset | The dataset contains 15 documentary films that are downloaded from YouTube, whose durations vary from 9 minutes to as long as 50 minutes, and the total number o... | Video object detection | link | 2017-06-23 | 984 |
384 | An RGB-D Dataset for 6D Pose Estimation of Texture-less Objects (T-LESS) | A dataset acquired with 3 synchronized sensors (Primesense Carmine 1.09, Microsoft Kinect v2, Canon IXUS 950 IS), featuring: * 30 industry-relevant objects:... | RGBD 3D pose texture-less object estimation | link | 2017-09-12 | 1146 |
383 | | ... | | n/a | 2020-08-22 | 591 |
382 | Portland State University, Blur image Dataset | The 2976 blurry images along with their corresponding ground-truth images and the blur kernels used to create them. All images are in gray-scale. This datas... | Blur, Kernels, Original2 | link | 2020-06-23 | 2401 |
381 | Facial Expression in-the-Wild (ExpW) Dataset | We built a new database named as Expression in-the-Wild (ExpW) dataset that contains 91,793 faces manually labeled with expressions. Each of the face images was... | Facial expression | link | 2019-11-29 | 2887 |
380 | CERTH Image Blur Dataset | The CERTH image blur dataset consists of 2450 digital images, 1850 out of which are photographs captured by various camera models in different shooting conditio... | blur motion defocus detection quality image | link | 2020-10-30 | 1613 |
379 | Crepe Cooking Dataset | The Crepe Dataset provides 6 different types of structured activity videos in 1920x1080 resolution. Each activity is represented as a sequence of different acti... | structured activity action recognition cooking recipe crepe simultaneous parallel | link | 2017-05-19 | 913 |
377 | Lane Level Localization on a 3D Map | The Lane Level Localization dataset was collected on a highway in San Francisco with the following properties: * Reasonable traffic * Multiple lane highway ... | 3d map localization autonomous car driving gps benchmark video road | link | 2017-05-10 | 1207 |
376 | ScanNet | ScanNet is an RGB-D video dataset containing 2.5 million views in more than 1500 scans, annotated with 3D camera poses, surface reconstructions, and instance-le... | scene indoor synthetic cad room layout rendering realism 3d segmentation object recognition | link | 2017-05-12 | 1251 |
375 | SUNCG: Indoor Scenes | The SUNCG dataset is a Large 3D Model Repository for Indoor Scenes. SUNCG is an ongoing effort to establish a richly-annotated, large-scale dataset of 3D s... | scene indoor synthetic room layout rendering realism 3d segmentation object recognition | link | 2020-06-01 | 2751 |
374 | SceneNet RGB-D Synthetic Indoor | SceneNet RGB-D is a dataset comprising 5 million Photorealistic Images of Synthetic Indoor Trajectories with Ground Truth. It expands the previous work of Scene... | scene indoor synthetic robot navigation rendering 3d reconstruction trajectory lighting segmentation slam | link | 2017-05-02 | 1280 |
373 | DAVIS: Densely Annotated VIdeo Segmentation 2017 | We present the 2017 DAVIS Challenge, a public competition specifically designed for the task of video object segmentation. Following the footsteps of other succ... | object tracking segmentation video benchmark code hd quality resolution | link | 2018-04-05 | 1148 |
372 | VOT2016 segmentation | The VOT2016 pixel-wise annotations dataset contains pixel-wise per-frame annotations for sequences from VOT2016 dataset. The annotation is in a form of BW image... | object tracking segmentation mask annotation visual | link | 2017-04-17 | 1049 |
371 | ICS-FORTH MHAD101 Action Co-segmentation | This is a custom generated dataset designed for the task of action co-segmentation in pairs of action sequences. The dataset contains 101 pairs of action se... | action co-segmentation, temporal segmentation, motion capture data, time series | link | 2018-03-22 | 1104 |
370 | Pornography Dataset (NPDI/DCC/UFMG) | The Pornography database contains nearly 80 hours of 400 pornographic and 400 non-pornographic videos. For the pornographic class, we have browsed websites whic... | pornography, video, video shots, video frames | link | 2024-09-05 | 13933 |
369 | Nude Detection Dataset - Images (NPDI/DCC/UFMG) | The database contains 180 images collected from the Web. If you make use of our database, please cite the following reference: LOPES, Ana; AVILA, Sandra... | nude detection, images | link | 2023-06-26 | 5836 |
368 | Nude Detection Dataset — Videos (NPDI/DCC/UFMG) | The database of nude and non-nude videos contains a collection of 179 video segments collected from the following movies: Alpha Dog, Basic Instinct, Before The ... | nude detection, video, movie | link | 2022-08-08 | 1772 |
367 | NUS Multi-Sensor Presentation (NUSMSP) Dataset | This dataset consists of 51 oral presentations recorded with 2 ambient visual sensors (web-cam), 3 First Person View (FPV) cameras (1 on presenter and 2 on randomly c... | multi-sensor presentation analysis video kinect quality | link | 2017-09-12 | 1145 |
366 | Multi-Camera Action Dataset | An indoor action recognition dataset which consists of 18 classes performed by 20 individuals. Each action is individually performed 8 times (4 daytime and ... | indoor video Multi-Camera Action Recognition Cross-View Recognition Open-View Recognition | link | 2017-09-12 | 1226 |
365 | Pedestrian Color Naming Dataset | Pedestrian Color Naming (PCN) dataset contains 14,213 images, each of which hand-labeled with color label... | Pedestrian, segmentation, color naming | link | 2017-10-27 | 2097 |
364 | ETH CVL IMDB WIKI Faces | Since the publicly available face image datasets are often of small to medium size, rarely exceeding tens of thousands of images, and often without age informat... | face imdb wikipedia detection recognition age biometry | link | 2017-02-22 | 1021 |
363 | ETH/Yahoo Video2Gif dataset | The Video2GIF dataset contains over 100,000 pairs of GIFs and their source videos. The GIFs were collected from two popular GIF websites (makeagif.com, gifsoup.... | highlight video summarization gif summary scene understanding | link | 2017-09-12 | 1045 |
362 | LITIV Datasets | 3 datasets: PTZ Tracking, Thermal-visible registration, Single object tracking... | Tracking, PTZ, Thermal, Pedestrian | link | 2017-12-17 | 1772 |
361 | KAIST Multispectral Pedestrian Detection Benchmark | We developed imaging hardware consisting of a color camera, a thermal camera and a beam splitter to capture the aligned multispectral (RGB color + Thermal) imag... | pedestrian, thermal, RGB | link | 2015-11-09 | 1663 |
360 | ETHZ Multi-Person Tracking | Robust Multi-Person Tracking from Mobile Platforms In all cases, data was recorded using a pair of AVT Marlins F033C mounted on a chariot respectively a car,... | pedestrian, color, sequence, tracking | link | 2020-08-22 | 1833 |
358 | BYU+VT Small Aircraft Flight Encounters initial dataset | A dataset of 11 encounters between two small Unmanned Aircraft Systems. The "host" UAS carries two stereo HD video cameras, a custom FM-CW radar, a PixHawk navi... | uas unmanned aircraft sense avoid stereo flying flight byu vt radar | link | 2017-01-14 | 1423 |
357 | udacity self-driving-car | At Udacity, we believe in democratizing education. How can we provide opportunity to everyone on the planet? We also believe in teaching really amazing and usef... | car robot driving autonomous street urban video recognition detection classification segmentation time synthetic | link | 2017-03-15 | 1722 |
356 | The Oxford RobotCar Dataset | The Oxford RobotCar Dataset contains over 100 repetitions of a consistent route through Oxford, UK, captured over a period of over a year. The dataset captures ... | car robot driving autonomous street urban video recognition detection classification segmentation time year | link | 2017-01-04 | 1871 |
355 | IMPART multi-modal/multi-view | The multi-modal/multi-view datasets are created in a cooperation between University of Surrey and Double Negative within the EU FP7 IMPART project. The sourc... | multiview multi-mode video rgbd lidar 3d model color indoor outdoor dynamic action face human emotion | link | 2019-03-16 | 1318 |
354 | Facial Expression Research Group Database (FERG-DB), University of Washington, Seattle | FERG-DB is a database of stylized characters with annotated facial expressions. The database contains multiple face images of six stylized characters. The chara... | Face, Facial expression, Animation, Stylization, annotation emotion, deep learning, anger, sad, joy, disgust, surprise, neutral, fear, cardinal classification, human transfer, image retrieval | link | 2019-12-01 | 2029 |
353 | COCO-Stuff | COCO-Stuff augments the COCO dataset with pixel-level stuff annotations for 10,000 images. These annotations can be used for scene understanding tasks like sema... | semantic segmentation stuff things COCO caption annotation groundtruth benchmark | link | 2019-01-09 | 1730 |
352 | 4D Light Field Dataset (HCI Heidelberg & CVIA Konstanz) | A synthetic light field dataset with 24 scenes. Data provided for each scene: - 9x9x512x512x3 light fields as individual PNGs - config files with camera s... | light field, ground truth, synthetic, disparity, depth | link | 2016-12-05 | 1556 |
351 | CMLA Subpixel Stereo Dataset | A 66 stereo pairs dataset with their subpixel ground truths. The construction and improvement of algorithms for subpixel stereovision requires very precise t... | stereo stereo vision subpixel groundtruth 3D pointcloud noise depth | link | 2020-05-17 | 1542 |
350 | The 2D Shape Structure Dataset | The 2D Shape Structure database is a public, user-generated dataset of 2D shape decompositions into a hierarchy of shape parts with geometric relationships reta... | 2d shape decomposition, 2d shape hierarchy, 2d shape structure, Medial axis | link | 2024-08-13 | 2985 |
349 | HKUST Ambiguity Dataset | This dataset contains two image collections, TempleOfHeaven and SportsArena, that are deemed hard for Structure-from-Motion (SfM). The method is described i... | Structure Motion, Ambiguous structure sfm | link | 2023-06-20 | 1442 |
348 | Global Symmetry AVA Dataset | Global Symmetry Ground-truth for AVA dataset Release Date: 2016 For detailed information, please refer to: Elawady, Mohamed, Cécile Barat, Christophe Du... | Global Bilateral Symmetry Detection Aesthetics Reflection Mirror | link | 2017-11-28 | 1089 |
347 | MOCAT (TUB Multi-Object and Multi-Camera Tracking Dataset) | The TU Berlin Multi-Object and Multi-Camera Tracking Dataset (MOCAT) is a synthetic dataset to train and test tracking and detection systems in a virtual world.... | synthetic tracking detection multi-class multiview evaluation pedestrian vehicle animal | link | 2019-10-29 | 2323 |
346 | LASIESTA (Labeled and Annotated Sequences for Integral Evaluation of SegmenTation Algorithms) | LASIESTA is composed of many real indoor and outdoor sequences organized in different categories, each one covering a specific challenge in moving object det... | dataset groundtruth motion object detection foreground background subtraction challenge stationary camera | link | 2020-06-10 | 1305 |
345 | MMSE Heartrate | The MMSE heart rate dataset measures the visual heart rate from faces by throwing darts at people. ... | face landmark emotion heart rate biology | n/a | 2019-04-30 | 1194 |
344 | Unimore - YACCLAB dataset | The YACCLAB dataset includes both synthetic and real binary images and is suitable for a wide range of applications, ranging from document processing to survail... | Labeling Binary Text Medical Fingerprint Video Surveillance Natural Random Noise | link | 2019-01-03 | 1369 |
343 | FIRE Fundus Image Registration Dataset | A benchmark dataset for the evaluation of retinal image registration methods is introduced. The dataset consists of 134 image pairs and is annotated with ground... | retina retinal image registration fundus eye | link | 2016-10-17 | 1253 |
342 | ICS-FORTH + Modelling of 2D Shapes with Ellipses | The dataset contains more than 4,536 2D shapes included in standard as well as in home-build datasets. Our goal is to represent a given 2D shape with an au... | shape ellipse fitting modeling 2d object classification | link | 2018-03-22 | 2213 |
341 | CVL OCR DB | CVL OCR DB is a public annotated image dataset of 120 binary annotated (text/non-text) images of text in natural scenes. Images include signboards, shop names, ... | OCR, sign recognition | link | 2016-10-13 | 881 |
340 | Ljubljana CVL Face Database | Database contains 798 images of 114 persons, with 7 images per person and is freely available for research purposes. All images were taken in supervised conditi... | face pedestrian person recognition biometry human illumination lighting | link | 2020-03-03 | 1933 |
339 | Annotated Web Ears Dataset (AWE Dataset) | Dataset contains 1000 images of 100 persons, with 10 images per person and is freely available. All images were acquired by cropping ears from images from the i... | ear biometry person pedestrian recognition human lighting test | link | 2020-05-21 | 1730 |
338 | MIT LaMem: Large-Scale Image Memorability Dataset | This database contains 60,000 images with memorability scores. The images come from a variety of datasets including SUN, COCO, image popularity, AVA, and severa... | memorability aesthetics object scene popularity | link | 2020-03-09 | 1592 |
337 | WIDER Attribute Dataset | WIDER ATTRIBUTE dataset is a human attribute recognition benchmark dataset, of which images are selected from the publicly available WIDER dataset. There are a ... | Attribute recognition, Human attribute | link | 2016-09-22 | 2778 |
336 | Procedural texture perceptual similarity | The procedural texture perceptual similarity dataset contains a list of procedural textures along with their pairwise distances, as defined by a perceptual stud... | texture procedural benchmark study | link | 2016-09-21 | 991 |
335 | General 100 | General-100 dataset contains 100 bmp-format images (with no compression). We used this dataset in our FSRCNN ECCV 2016 paper. The size of these 100 images range... | image superresolution | link | 2017-07-22 | 2473 |
334 | LabelMeFacade | The LabelMeFacade dataset contains buildings, windows, sky and a limited number of unlabeled regions (maximally 20% covering of the image). This procedure res... | segmentation semantic facade urban rectified recognition | link | 2016-08-23 | 1788 |
333 | UBC3V Dataset | UBC3V is a synthetic dataset for training and evaluation of single or multiview depth-based pose estimation techniques. The nature of the data is similar to the... | depth segmentation pose | link | 2016-08-18 | 1696 |
332 | Multi-FoV - Large Field-of-View Cameras for Visual Odometry | The Multi-FoV synthetic datasets are two synthetic scenes (vehicle moving in a city, and flying robot hovering in a confined room). For each scene, three differ... | visual odometry camera fov synthetic groundtruth blender | link | 2016-08-11 | 1472 |
331 | EuRoC MAV Dataset | This web page presents visual-inertial datasets collected on-board a Micro Aerial Vehicle (MAV). The datasets contain stereo images, synchronized IMU measuremen... | aerial vehicle, indoor, global shutter, slam | link | 2017-11-28 | 2194 |
330 | Cityscapes | We present a new large-scale dataset that contains a diverse set of stereo video sequences recorded in street scenes from 50 different cities, with high quality... | stereo video urban city semantic segmentation detection car person pedestrian weakly | link | 2018-03-22 | 2798 |
329 | Virginia Tech and Arab Academy for Science & Technology (VT-AAST) The VT-AAST Benchmarking Dataset | A New Color Image Database for Benchmarking of Face Detection Techniques and Human Skin Segmentation Techniques. A new color face image database for ... | face, detection, skin, segmentation, benchmarking, | link | 2016-07-11 | 1524 |
328 | UT Zappos50K | UT Zappos50K (UT-Zap50K) is a large shoe dataset consisting of 50,025 catalog images collected from Zappos.com. The images are divided into 4 major categories -... | fine-grained, ranking, local learning, pairwise comparison, shoe, attribute | link | 2018-09-11 | 1536 |
327 | PIROPO Database: People in Indoor ROoms with Perspective and Omnidirectional cameras | The PIROPO database (People in Indoor ROoms with Perspective and Omnidirectional cameras) comprises multiple sequences recorded in two different indoor rooms, u... | people surveillance perspective omnidirectional fisheye indoor room detection human | link | 2017-02-16 | 2193 |
326 | Desk3D (Cambridge University) | Instance recognition from depth data. Contains various challenges of Pose, Clutter, Occlusion and similar looking objects (Bonde, U., Badrinarayanan, V., & Cipo... | depth instance pose detection | link | 2020-04-29 | 1503 |
325 | Synthesized Inverse Synthetic Aperture Radar (ISAR) Images of Aircrafts | The database contains synthesized inverse synthetic aperture radar images of seven aircraft models. Reference: Hari Kishan Kondaveeti, Valli Kumari Va... | ISAR, image, classification | link | 2019-06-27 | 1594 |
324 | Historical Car Database | The database contains historical car images from 1920s to 1990s crawled from cardatabase.net. There are 10130 training and 3343 test images. Annotations incl... | Car, Recognition, Time | link | 2016-03-17 | 1512 |
323 | UT Egocentric (UT Ego) Dataset | The Univ. of Texas at Austin Egocentric (UT Ego) Dataset contains 4 videos captured from head-mounted cameras. Each video is about 3-5 hours long, captured in ... | First-person vision, egocentric | link | 2016-03-17 | 1204 |
322 | Kendall Square Webcam | The Kendall Square webcam dataset consists of two streams for one sunny day and one cloudy day of a city square. It is used for tracking and analyzing color cha... | webcam color weather change detection appearance sky | link | 2016-03-02 | 1688 |
321 | Webcam Interestingness | The Webcam Interestingness dataset consists of 20 different webcam streams, with 159 images each. It is annotated with interestingness ground truth, acquired in... | webcam interest classification retrieval ranking video weather | link | 2019-10-17 | 1425 |
320 | San Francisco Landmark Dataset for Mobile Landmark Recognition | The San Francisco Landmark Dataset for Mobile Landmark Recognition is a set of images and query images for localization. We present the San Francisco Landmar... | retrieval localization city urban sanfrancisco landmark calibration gps mobile | link | 2016-03-04 | 1554 |
319 | Visual Search Patches | The Compact Descriptors for Visual Search Patches Dataset (CDVS) is a dataset comprised of pairwise image patches. MPEG is a standard titled Compact Descriptor... | patch matching retrieval descriptor feature mpeg | link | 2016-02-11 | 1195 |
318 | Mouse Embryo Tracking Database | The database contains, for each of the 100 examples: (1) the uncompressed frames, up to the 10th frame after the appearance of the 8th cell; (2) a text file wit... | tracking cell circle biology mouse trajectory | link | 2016-02-11 | 1130 |
317 | NYU Symmetry Database | The mirror symmetry database contains 176 single-symmetry and 63 multiple-symmetry images (.png files) with accompanying ground-truth annotations (.mat files). ... | symmetry detection mirror groundtruth | link | 2016-04-15 | 1028 |
316 | Extreme Classification Repository | The Extreme Classification Repository: Multi-label Datasets & Code Kush Bhatia • Kunal Dahiya • Himanshu Jain • Yashoteja Prabhu • Manik Varma The... | machine learning multilabel classification benchmark evaluation | link | 2018-03-19 | 1866 |
315 | Geosemantic | The Geosemantic is a dataset of object locations from GIS and a query image with metadata. It is used to project the buildings and streets that are in the field... | semantic segmentation gps geography supervised gis | link | 2016-01-07 | 1175 |
314 | WIDER FACE: A Face Detection Benchmark | WIDER FACE dataset is a large-scale face detection benchmark dataset with 32,203 images and 393,703 face annotations, which have high degree of variabilities in... | face detection scale pose occlusion | link | 2019-07-22 | 2298 |
313 | Automotive Multi-sensor (AMUSE) | The automotive multi-sensor (AMUSE) dataset consists of inertial and other complementary sensor data combined with monocular, omnidirectional, high frame rate v... | street urban inertial video image traffic city api | link | 2017-11-28 | 1869 |
312 | University of Leon - Edge profile milling head tool data set | This data set comprises 144 images of an edge profile cutting head of a milling machine. The head tool contains a total of 30 cutting inserts. The cutting head ... | milling head tool inserts localization object cutting tool edge profile tool wear monitoring | link | 2018-11-20 | 1422 |
311 | ASL Datasets Repository | This site is dedicated to providing datasets for the Robotics community with the aim to facilitate result evaluations and comparisons. The datasets presented on t... | laser 3d urban nature city | link | 2015-10-28 | 1264 |
310 | FASSEG - FAce Semantic Segmentation | The FAce Semantic SEGmentation (FASSEG) repository contains datasets for multi-class semantic face segmentation. The FASSEG repository is composed by two dat... | face, segmentation | link | 2017-04-04 | 2700 |
309 | Contour patches | The contour patches dataset is a large dataset of image patch matches used for contour detection. References: C. L. Zitnick and D. Parikh The Role of Im... | patch image match contour edge lowlevel detection segmentation | link | 2015-09-29 | 1332 |
308 | TST Intake Monitoring dataBase | It is composed of food intake movements, recorded with Kinect V1 (320×240 depth frame resolution), simulated by 35 volunteers for a total of 48 tests. The device ... | human food intake monitoring behavior kinect pointcloud tracking age groundtruth | link | 2018-01-06 | 1264 |
307 | HandNet annotated hand dataset | The HandNet dataset contains depth images of 10 participants' hands non-rigidly deforming in front of a RealSense RGB-D camera. This dataset includes 214971 a... | hand articulation segmentation classification detection pose fingertip rgbd video | link | 2017-09-12 | 2613 |
306 | Shadow Removal Dataset and Online Benchmark for Variable Scene Categories (University of Bath, Bath) | To encourage the open comparison of single image shadow removal in the community, we provide an online benchmark site and a dataset. Our quantitatively verified hig... | shadow removal benchmark illumination singleview | link | 2017-12-02 | 2061 |
305 | SPHERE human skeleton movements | The SPHERE human skeleton movements dataset was created using a Kinect camera, that measures distances and provides a depth map of the scene instead of the clas... | human action behavior motion movement video skeleton depth kinect | link | 2016-03-24 | 1570 |
304 | Ian Dworkin (McMaster University) | This is the database of biological images (from the genetics model system, Drosophila melanogaster, a fruit fly) across multiple levels of variation. We have... | biology genetic variation fly animal classification | link | 2016-02-11 | 1356 |
303 | 1DSfM Landmarks | The 1DSfM Landmarks is a collection of community-based image reconstruction by Kyle Wilson and is comprised of 14 datasets with comparison to bundler ground tru... | 3d reconstruction landmark groundtruth benchmark urban city | link | 2018-12-11 | 1537 |
302 | CMP map2photo | The CMP map2photo dataset consists of 6 pairs, where one image is a satellite photo and the second image is a map of the same area. The task is to match these images... | feature detection description matching map remote sensing wide baseline | link | 2015-08-13 | 1494 |
301 | CMP Extreme Zoom Dataset | The Extreme Zoom Dataset. EZD consists of 6 image sets with increasing zoom factor from general scene view to focusing on single detail. MODS: Fast and Robust Metho... | feature detection description matching viewpoint zoom | link | 2015-07-15 | 1336 |
300 | CMP WxBS dataset | The Wide (multiple) Baseline Dataset. 31 image pairs, simultaneously combining several nuisance factors: geometry, illumination, IR-visible, etc. WxBS: Wide ... | feature detection description matching viewpoint IR day night | link | 2015-07-15 | 2166 |
299 | CAMP-TUM: Multiple Human Pose Estimation from Multiple Views | We introduce the Shelf dataset for multiple human pose estimation from multiple views. In addition we annotate the body joints in the Campus dataset from CVLAB@... | 3D human pose estimation multiple view motion capture | link | 2015-07-15 | 1798 |
298 | Freiburg-Berkeley Motion Segmentation | The Freiburg-Berkeley Motion Segmentation Dataset (FBMS-59) is an extension of the BMS dataset with 33 additional video sequences. A total of 720 frames is anno... | video segmentation benchmark object tracking pedestrian groundtruth motion | link | 2017-03-21 | 2212 |
297 | Berkeley Video Segmentation | The Berkeley Video Segmentation Dataset (BVSD) contains videos for segmentation (boundary?) Dataset train Dataset test... | video segmentation benchmark | link | 2015-07-14 | 1763 |
296 | Video Segmentation Benchmark | The Video Segmentation Benchmark (VSB100) provides ground truth annotations for the Berkeley Video Dataset, which consists of 100 HD quality videos divided into... | video segmentation benchmark object tracking pedestrian groundtruth motion | link | 2018-11-12 | 2710 |
295 | Rent3D | The Rent3D dataset comprises floorplans and images. The goal of this work is to enable a 3D virtual-tour of an apartment given a small set of monocular images o... | indoor building reconstruction layout floorplan apartment urban | link | 2015-07-13 | 1569 |
294 | Happy People Images Database | Group emotion recognition in images - Happiness Intensity labels for group of people in images. The images have been collected from Flickr using keyword search ... | group, facial expression, emotion, wild, human, flickr, behavior | link | 2021-05-16 | 1710 |
293 | Google Street View Localization | The Google Street View dataset contains 62,058 high quality Google Street View images. The images cover the downtown and neighboring areas of Pittsburgh, PA; Or... | localization retrieval gps google street urban panorama pittsburgh address manhattan sphere | link | 2017-11-28 | 2016 |
292 | Mobile Phone and Webcam Hand Images for Personal Authentication and Identification | This work attempts to provide two Hand Images Databases for hand biometrics: one is created using a mobile phone camera of modest quality, which we called mob... | mobile webcam hand authentication Identification person biometric shape segmentation | link | 2015-11-09 | 2889 |
291 | MIT Places205 | Places205 database contains 2.5 million images from 205 scene categories for the academic public. The image dataset contains 2,448,873 images from 205 scene c... | place recognition urban scene feature learning | link | 2016-02-24 | 1839 |
290 | UWO GCO Volume Segmentation | The Western GCO Segmentation problem instances are provided to compare effects of graph size, neighborhood size, length of s to t paths, regional arc consistenc... | medical liver babyface bone abdomen adhead face segmentation binary optimization | link | 2015-06-19 | 1373 |
289 | ETHZ CVL Clust | MICCAI 2015 Challenge on Liver Ultrasound Tracking Munich, October 9, 2015 (Full Day) Outline Ultrasound (US) imaging is a widely used medical imaging techn... | medical liver tracking ultrasound therapy human organ benchmark real | link | 2015-06-19 | 1336 |
288 | Berkeley Urban Street tracking | The UrbanStreet dataset used in the paper can be downloaded here [188M] . It contains 18 stereo sequences of pedestrians taken from a stereo rig mounted on a ca... | tracking detection segmentation multitarget recognition video pedestrian urban human | link | 2019-02-21 | 2683 |
287 | INRIA Lafarge Benchmarks | Some datasets and evaluation tools are provided on this page for four different computer vision and computer graphics problems. Population counting Line-ne... | 3d surface reconstruction groundtruth pointcloud object detection line road network urban crowd pedestrian counting | link | 2020-04-28 | 1977 |
286 | HDA Person Dataset - ISR Lisbon | The High Definition Analytics (HDA) dataset is a multi-camera High-Resolution image sequence dataset for research on High-Definition surveillance: Pedestrian De... | Video Surveillance Pedestrian Detection Re-Identification Multiview Tracking Benchmark Indoor High-Definition Camera Network lisbon human | link | 2020-03-17 | 4696 |
285 | ISPRS-EuroSDR Multi-Platform | ISPRS / EuroSDR Benchmark for Multi-Platform Photogrammetry In these pages you can get information about the BENCHMARK FOR MULTI-PLATFORM PHOTOGRAMMETRY unde... | aerial multiview 3d photogrammetry germany switzerland urban city benchmark reconstruction | link | 2015-06-16 | 1510 |
284 | TRANCOS Overlapping Car Crowds | The TRaffic ANd COngestionS (TRANCOS) dataset, a novel benchmark for (extremely overlapping) vehicle counting in traffic congestion situations. It consists of 1... | object detection car transportation vehicle highway urban spain traffic | link | 2015-06-16 | 2170 |
283 | ISPRS WG III/4 | ISPRS Test Project on Urban Classification, 3D Building Reconstruction and Semantic Labeling. In this part of our working group site you will get further inform... | aerial multiview 3d photogrammetry germany canada semantic segmentation urban city recognition benchmark | link | 2015-06-16 | 1775 |
282 | ISPRS-EuroSDR HighDensity | ISPRS and EuroSDR - Benchmark on High Density Aerial Image Matching Background and Scope of the project Innovations in matching algorithms as well as the... | aerial multiview 3d photogrammetry germany switzerland urban city benchmark reconstruction | link | 2015-06-16 | 1526 |
281 | Tuberculosis image and patient data | Permanently growing database on lung tuberculosis patients. The data include radiological images (CT+XRay) plus social, clinical, and lab data as well as full g... | chest xray CT tuberculosis genome medical segmentation | link | 2020-01-07 | 2034 |
280 | Yahoo Flickr Creative Commons 100M | Yahoo Flickr Creative Commons 100M (YFCC100M) dataset contains a list of photos and videos. This list is compiled from data available on Yahoo! Flickr. All the ... | flickr landmark image recognition detection reconstruction 3d clustering social community internet | link | 2015-09-24 | 2292 |
279 | WWW Crowd | The Where Who Why (WWW) dataset provides 10,000 videos with over 8 million frames from 8,257 diverse scenes, therefore offering a superior comprehensive dataset... | surveillance crowd pedestrian detection recognition flow optical video | link | 2020-06-09 | 2447 |
278 | Comprehensive Cars (CompCars) | The Comprehensive Cars (CompCars) dataset contains data from two scenarios, including images from web-nature and surveillance-nature. The web-nature data contai... | car vehicle recognition attribute classification fine-grained urban object | link | 2019-09-28 | 3386 |
277 | Detail 2D Projection DataSet | Detail 2D Projection DataSet is a database of 2d projections of mechanical details with holes. The dataset consists of 13 shape categories where each category i... | shape, holes, detail, binary, matching, retrieval | link | 2015-05-10 | 2187 |
276 | TST TUG (Timed Up and Go) | The TUG (Timed Up and Go test) dataset consists of actions performed three times by 20 volunteers. The people involved in the test are aged between 22 and 39, w... | action recognition time kinect wearable accelerometer human video | link | 2015-05-02 | 1334 |
275 | TST fall detection | It is composed of ADL (activities of daily living) and fall actions simulated by 11 volunteers. The people involved in the test are aged between 22 and 39, with diff... | action recognition detection depth kinect wearable accelerometer human video | link | 2017-03-14 | 1892 |
274 | UBO 2014 Materials | The UBO 2014 consists of 7 semantic categories. Each of these 7 material categories contains measurements of 12 different material instances for being capable t... | material light illumination texture classification recognition | link | 2018-03-10 | 1292 |
273 | SBMI 2015 | Scene Background Initialization (SBI) dataset The SBI dataset has been assembled in order to evaluate and compare the results of background initialization al... | change detection background initialization foreground benchmark | link | 2015-05-02 | 1417 |
272 | Stanford 40 Actions | The Stanford 40 Actions dataset contains images of humans performing 40 actions. In each image, we provide a bounding box of the person who is performing the ac... | human action recognition detection boundingbox | link | 2015-06-19 | 2385 |
271 | Labeling in 3D Scenes | This dataset package contains the software and data used for Detection-based Object Labeling on the RGB-D Scenes Dataset as implemented in the paper: Detecti... | 3d kinect reconstruction indoor depth object recognition | link | 2015-03-16 | 1606 |
270 | B3DO: Berkeley 3D Object Dataset | For the first few decades of the field's existence, computer vision has been focused on algorithmic, logical approaches to perception. But it was only with the a... | 3d kinect reconstruction indoor depth object recognition | link | 2020-02-25 | 1578 |
269 | Daimler Urban Segmentation Dataset | The Daimler Urban Segmentation Dataset consists of video sequences recorded in urban traffic. The dataset consists of 5000 rectified stereo image pairs with a r... | semantic segmentation outdoor urban stereo motion | link | 2015-06-26 | 2656 |
268 | HUJI Multi-illuminant Image Sequences dataset | The Multi-illuminant Image Sequences dataset contains 16 video sequences (13 with single light source and 3 with two global light sources), recorded with a HD ... | illumination nature physics dichromatic light chromaticity color constancy white balance object | link | 2015-02-20 | 1414 |
267 | 3DVis | The 3DVis dataset includes a set of 12 heterogeneous scenes for testing 3D scene registration and analysis methods. Models include homogeneous shapes, repetitiv... | 3d reconstruction matching registration shape symmetry | link | 2015-01-26 | 2292 |
266 | Paris Art Deco Facades | The Paris Art Deco Facades dataset consists of 79 / 80 images of rectified facades of the architectural style Art Deco, which has different sizes of windows, de... | paris semantic segmentation recognition architecture facade urban city procedural grammar | link | 2015-01-20 | 1634 |
265 | Salient Montages: Human-centric Video Summarization | The Salient Montages is a human-centric video summarization dataset from the paper [1]. In [1], we present a novel method to generate salient montages from u... | video summarization montage saliency wearable human | link | 2015-05-02 | 1403 |
264 | Domain-specific Personal Videos Highlight Dataset | The domain-specific personal videos highlight dataset from the paper [1] describes a fully automatic method to train domain-specific highlight ranker for raw p... | video summarization saliency wearable human action recognition domain | link | 2015-05-02 | 1619 |
263 | Crowd Dataset | The crowd datasets are collected from a variety of sources, such as UCF and data-driven crowd datasets. The sequences are diverse, representing dense crowd in t... | crowd video detection anomaly scene understanding human pedestrian | link | 2017-09-19 | 3493 |
262 | PHOS (Evaluating illumination invariance) | Phos is a color image database of 15 scenes captured under different illumination conditions. Every scene of the database contains 15 different images: 9 images... | illumination invariance, real lighting condition, uneven illumination, shadow, feature detection | link | 2018-03-22 | 1507 |
261 | MPI Multi-View Collection GVV datasets | Welcome to the homepage of the gvvperfcapeva datasets. This site serves as a hub to access a wide range of datasets that have been created for projects of the G... | video multiview tracking face mesh reconstruction depth human action pose | link | 2014-12-10 | 1644 |
260 | Eurasian Cities dataset | The Eurasian Cities dataset contains 103 images of outdoor urban scenes taken in Eurasian cities. It is annotated with horizontal and vertical vanishing points ... | vanishing line point geometry pose urban reconstruction outdoor manhattan | link | 2018-01-11 | 2115 |
259 | MOT Challenge 2D and 3D | The MOT Challenge is a framework for the fair evaluation of multiple people tracking algorithms. In this framework we provide: - A large collection of datase... | 3d tracking multiple target benchmark dataset people pedestrian surveillance video | link | 2019-09-26 | 2585 |
258 | Visual Attributes dataset | The Visual Attributes dataset contains visual attribute annotations for over 500 object classes (animate and inanimate) which are all represented in ImageNet. E... | classification recognition attribute imagenet object | link | 2020-05-18 | 1910 |
257 | FaceScrub | The FaceScrub dataset comprises a total of 107818 unconstrained face images of 530 celebrities crawled from the Internet, with about 200 images per person. M... | face detection recognition celebrity people human | link | 2018-06-30 | 1952 |
256 | Multi-Task Facial Landmark (MTFL) dataset | This dataset contains 12,995 face images which are annotated with (1) five facial landmarks, (2) attributes of gender, smiling, wearing glasses, and head pose. ... | face, landmark detection, deep learning, cnn, attribute | link | 2015-11-07 | 3327 |
255 | Robotic 3D Scan Repository | The Robotic 3D Scan Repository from Osnabrueck contains 23 different datasets showing a variety of 3D scans for objects, humans, cities, university campus, heat... | 3d reconstruction scan laser heat urban city human aerial germany bremen lidar osnabrueck | link | 2015-04-10 | 2061 |
254 | ChokePoint Dataset | We collected a video dataset, termed ChokePoint, designed for experiments in person identification/verification under real-world surveillance conditions using e... | human pedestrian identification recognition multiview sequence face detection real world surveillance clustering | link | 2015-05-02 | 2638 |
253 | Street View House Number (SVHN) | SVHN is a real-world image dataset for developing machine learning and object recognition algorithms with minimal requirement on data preprocessing and formatti... | street number recognition classification urban detection text real world | link | 2017-11-28 | 1749 |
252 | Volleyball Activity Dataset 2014 | This dataset contains 7 challenging volleyball activity classes annotated in 6 videos from professionals in the Austrian Volley League (season 2011/12). A total... | action activity sport volleyball detection recognition video analysis | link | 2020-06-15 | 3165 |
251 | ETHZ CVL RueMonge 2014 | The ETHZ CVL RueMonge 2014 dataset is used for 3D reconstruction and semantic mesh labelling for urban scene understanding. It was first published in [1] and p... | semantic segmentation 3d reconstruction architecture paris benchmark source code urban recognition classification outdoor pointcloud mesh | link | 2014-11-24 | 3431 |
250 | | ... | | n/a | 2020-08-22 | 1569 |
249 | Image Sequence Analysis Test Site (EISATS) | The .enpeda.. Image Sequence Analysis Test Site (EISATS) offers sets of long bi- or trinocular image sequences recorded in the context of vision-based driver as... | stereo vision optical flow motion analysis semantic segmentation | link | 2014-09-30 | 2129 |
248 | VIDEO datasets overview | Many different labeled video datasets have been collected over the past few years, but it is hard to compare them at a glance. So we have created a handy spread... | video benchmark recognition classification detection object action | link | 2018-04-23 | 2018 |
247 | PASCAL VOC Parts | The PASCAL VOC is augmented with segmentation annotation for semantic parts of objects. For example, for the person category, we provide segmentation mask for 2... | detection recognition pascal object part pedestrian human segmentation semantic | link | 2014-09-30 | 2441 |
246 | Bristol Egocentric Object Interactions Dataset | The BEOID dataset includes object interactions ranging from preparing a coffee to operating a weight lifting machine and opening a door. The dataset is recorded... | video interaction object egocentric pose 3d tracking | link | 2017-09-12 | 2083 |
245 | ETHZ CVL Video SumMe | The Video Summarization (SumMe) dataset consists of 25 videos, each annotated with at least 15 human summaries (390 in total). The data consists of videos, anno... | video summary benchmark human groundtruth action event | link | 2020-04-23 | 4852 |
244 | Pedestrian Parsing on Surveillance Scenes (PPSS) dataset | The Pedestrian Parsing dataset contains 3,673 images from 171 videos of different Surveillance Scenes (PPSS), where 2,064 images are occluded and 1,609 are not.... | Pedestrian, Parsing, Segmentation | link | 2020-05-18 | 3310 |
243 | PEdesTrian Attribute (PETA) dataset | A new large-scale PEdesTrian Attribute (PETA) dataset. The dataset is by far the largest of its kind, covering more than 60 attributes on 19000 images. In comp... | Pedestrian, crowd | link | 2019-10-14 | 5239 |
242 | Stanford Dogs Dataset | The Stanford Dogs dataset contains images of 120 breeds of dogs from around the world. This dataset has been built using images and annotation from ImageNet for... | classification, detection, fine-grained categorization, dogs | link | 2015-07-29 | 3145 |
241 | Malaya Abrupt Motion (MAMo) Dataset | The Malaya Abrupt Motion (MAMo) dataset is targeted for visual tracking, particularly for abrupt motion tracking. It was collected from publicly accessible data... | visual tracking, abrupt motion tracking | link | 2016-11-05 | 1879 |
240 | Microsoft COCO | The Microsoft COCO (mscoco) is an image recognition and segmentation dataset which contains more than 300k images for more than 70 categories. Other features: Mo... | object context segmentation detection recognition benchmark semantic | link | 2015-05-02 | 2559 |
239 | CUHK crowd dataset | CUHK crowd dataset introduces the largest publicly available crowd dataset of 474 videos from 215 crowded scenes. It has been used in the paper: Scene-Ind... | crowd analysis group detection analysis scene understanding dataset | link | 2020-05-03 | 2954 |
238 | Multiple Foreground Video Co-segmentation | The multiple foreground video co-segmentation dataset consists of four sets, each with a video pair and two foreground objects in common. The dataset includ... | video co-segmentation | link | 2014-08-14 | 1568 |
237 | MOViCS video co-segmentation dataset | The video co-segmentation dataset contains 4 video sets with 11 videos in total, with 5 frames of each video labeled with pixel-level ground truth.... | video co-segmentation dataset | link | 2018-04-23 | 1895 |
236 | iCoseg dataset | iCoseg dataset introduces the largest publicly available co-segmentation dataset of 38 groups (643 images), along with pixel ground-truth hand annotations.... | image co-segmentation | link | 2019-11-30 | 2029 |
235 | Kindergarten Video Surveillance | The dataset consists of about 50 hours of kindergarten surveillance videos. In total, it contains approximately 100 video sequences (1000GB, 50 hours)... | human action behavior segmentation video background surveillance | link | 2020-01-28 | 2824 |
234 | UMD Dynamic Scene Recognition | The UMD Dynamic Scene Recognition dataset consists of 13 classes and 10 videos per class and is used to classify dynamic scenes. The dataset has been describ... | scene recognition classification dynamic video motion | link | 2017-01-05 | 1762 |
233 | PASCAL Context | We would like to announce the release of PASCAL-Context dataset. We augmented PASCAL VOC 2010 dataset with annotations for 400+ additional categories. In the cu... | semantic segmentation pascal benchmark category recognition dense shape | link | 2014-07-17 | 3019 |
232 | Pratheepan Human Skin Detection Dataset | The images in this dataset are downloaded randomly from Google for human skin detection research. It has been used in the paper: W.R. Tan, C.S. Chan, Y. Prathee... | skin detection, skin segmentation, human detection, skin dataset | link | 2019-08-29 | 4923 |
231 | -- | ... | | n/a | 2016-03-04 | 2094 |
230 | FGVC-Aircraft | Fine-Grained Visual Classification of Aircraft (FGVC-Aircraft) is a benchmark dataset for the fine grained visual categorization of aircraft. Data, annotatio... | fine-grained classification recognition benchmark evaluation aircraft airplane | link | 2020-01-13 | 4014 |
229 | Paris Rue Madame | Paris-rue-Madame dataset contains 3D Mobile Laser Scanning (MLS) data from rue Madame, a street in the 6th Parisian district (France). The test zone contains ap... | semantic segmentation pointcloud 3d laser classification | link | 2014-06-10 | 1594 |
228 | MPI VehicleScenes | Abstract Scene understanding has (again) become a focus of computer vision research, leveraging advances in detection, context modeling, and tracking. In thi... | semantic segmentation scene understanding classification 3d car pedestrian | link | 2014-06-10 | 2453 |
227 | Omnidirectional and panoramic image dataset | We share our omnidirectional and panoramic image dataset (with annotations) to be used for human and car detection. Please reach through: http://cvrg.iyte.edu.... | panorama detection car omnidirection human recognition | link | 2017-01-13 | 2839 |
226 | Fish4Knowledge | The Fish4Knowledge project (groups.inf.ed.ac.uk/f4k/) is pleased to announce the availability of 2 subsets of our tropical coral reef fish video and extracted... | classification animal fish video motion nature recognition water camera | link | 2014-05-15 | 2105 |
225 | California-ND | An Annotated Dataset For Near-Duplicate Detection In Personal Photo Collections Managing photo collections involves a variety of image quality assessment tas... | retrieval duplicate copyright groundtruth detection | link | 2014-03-19 | 1397 |
224 | CMP Extreme View Dataset | 15 wide baseline stereo image pairs with large viewpoint change, provided ground truth homographies. Image size (~1000x700 pixels, RGB) D. Mishkin and M. ... | feature detection description matching viewpoint | link | 2015-07-15 | 1724 |
223 | SHOT 3D shape description | The 3D shape description dataset consists of multiple sub-datasets Descriptor Matching - Dataset 1 & 2 (Stanford) These datasets, created from some of the m... | 3d shape description benchmark reconstruction registration matching | link | 2015-06-21 | 2553 |
222 | Ford Car Dataset | The Ford Car dataset is a joint effort of Pandey et al. (for collecting images, Lidar points, calibration etc.) and us (for annotation of 2D and 3D objects). ... | car detection lidar 3d groundtruth sfm | link | 2014-04-16 | 3366 |
221 | EPFL Multi-View Cars | The EPFL Multi-View Car dataset contains 20 sequences of cars as they rotate by 360 degrees. There is one image approximately every 3-4 degrees. Using the time o... | pose multiview car detection estimation rotation | link | 2014-02-10 | 1949 |
220 | 3D Mask Attack Dataset | The 3D Mask Attack Database (3DMAD) is a biometric (face) spoofing database. It currently contains 76500 frames of 17 persons, recorded using Kinect for both re... | 3d biometry face recognition segmentation frontview emotion | link | 2020-06-19 | 2251 |
219 | JPL First-Person Interaction | JPL First-Person Interaction dataset (JPL-Interaction dataset) is composed of human activity videos taken from a first-person viewpoint. The dataset particularl... | video action recognition interactive motion human | link | 2014-02-03 | 1377 |
218 | VidPairs | The VidPairs dataset contains 133 pairs of images, taken from 1080p HD (~2 megapixel) official movie trailers. Each pair consists of images of the same scene wi... | video pair matching patch description flow dense optical | link | 2015-06-19 | 1494 |
217 | Youtube-Objects dataset | The YouTube-Objects dataset is composed of videos collected from YouTube by querying for the names of 10 object classes. It contains between 9 and 24 videos for... | video object detection segmentation flow optical | link | 2019-07-16 | 2365 |
216 | CVC Partial Occlusion Virtual Pedestrian | The CVC Partial Occlusion Virtual Pedestrian datasets (CVC-01 to CVC-06) cover a range of scenarios of occluded pedestrians generated in a virtual and real envi... | detection classification tracking pedestrian synthetic urban occlusion | link | 2016-03-15 | 2465 |
215 | WILD - Weather and Illumination Database | The Weather and Illumination Database (WILD) is an extensive database of high quality images of an outdoor urban scene, acquired every hour over all seasons. It... | webcam light illumination camera video static change urban time depth estimation weather newyork | link | 2016-04-19 | 3205 |
214 | The Webcam Clip Art Dataset | This is a subset of the dataset introduced in the SIGGRAPH Asia 2009 paper, Webcam Clip Art: Appearance and Illuminant Transfer from Time-lapse Sequences. As... | webcam light illumination camera video static change urban nature time | link | 2020-03-08 | 1760 |
213 | ChairGest Gestures | ChairGest is an open challenge / benchmark. The task consists in spotting and recognizing gestures from multiple synchronized sensors: 1 Kinect and 4 Xsens Ine... | benchmark recognition kinect gesture detection human | link | 2014-06-06 | 1415 |
212 | Polo Instance Segmentation | The Polo instance segmentation dataset is a semantic segmentation task for Hough transform based segmentation masks. It consists of supervised segmentation for ... | semantic segmentation horse human outdoor mask scene understanding | n/a | 2016-01-21 | 1910 |
211 | POSTECH Labeled Faces in the Wild | POS Labeled Faces in the Wild is a collection of faces proposed for studying face identification in unconstrained environments; its purpose is serving as a... | face identification wild recognition registration | link | 2019-10-17 | 2121 |
210 | Traffic Video dataset | The Traffic Video dataset consists of X video of an overhead camera showing a street crossing with multiple traffic scenarios. The dataset can be downloaded ... | urban traffic tracking detection overhead view road video | link | 2020-01-24 | 6354 |
209 | Symmetry Set | The Symmetry set dataset is a collection of images at different illuminations for the purpose of image matching using local symmetry features. Image Matching... | symmetry matching feature image illumination lighting urban building | link | 2017-05-03 | 1785 |
208 | Landmark 1000 | The Landmark 1000 or 1k dataset is a collection of the top 1000 popular flickr landmarks mined from flickr. It is maintained by Noah Snavely and published in... | landmark 3d reconstruction pose estimation pointcloud world location | link | 2019-04-06 | 1941 |
207 | CASIA Gait Recognition Dataset | Dataset A (former NLPR Gait Database) was created on Dec. 10, 2001, including 20 persons. Each person has 12 image sequences, 4 sequences for each of the three ... | gait recognition biometry action classification motion human foot pressure | link | 2017-03-10 | 4881 |
206 | GaTech VideoContext | The GaTech VideoContext dataset consists of over 100 groundtruth annotated outdoor videos with over 20000 frames for the task of geometric context evaluation i... | video geometry context classification semantic segmentation unsupervised supervised outdoor urban nature | link | 2018-12-07 | 1829 |
205 | GaTech VideoStab | The GaTech VideoStab dataset consists of N videos for the task of video stabilization. This code is implemented in Youtube video editor for stabilization. ... | video stabilization camera path | link | 2013-10-09 | 1666 |
204 | UCF Person and Car VideoSeg | The UCF Person and Car VideoSeg dataset consists of six videos with groundtruth for video object segmentation. Surfing, jumping, skiing, sliding, big car, sm... | video segmentation object motion model camera groundtruth | link | 2015-04-19 | 1882 |
203 | GaTech VideoSeg | The GaTech VideoSeg dataset consists of two (waterski and yunakim?) video sequences for object segmentation. There exists no groundtruth segmentation annotat... | video segmentation object motion model camera | link | 2013-10-09 | 1996 |
202 | GaTech SegTrack | The SegTrack dataset consists of six videos (five are used) with ground truth pixelwise segmentation (6th penguin is not usable). The dataset is used for accura... | video segmentation object proposal flow optical motion model camera stationary groundtruth | link | 2013-10-09 | 1772 |
201 | 50 Salads | The dataset captures 25 people preparing 2 mixed salads each and contains over 4h of annotated accelerometer and RGB-D video data. Annotated activities correspo... | action activity recognition classification detection tracking video | link | 2013-10-05 | 1718 |
200 | Landmark 3D | This dataset provides a collection of web images and 3D models for research on landmark recognition (especially for methods based on 3D models). We hope it coul... | landmark recognition classification retrieval 3d reconstruction codebook matching feature flickr | link | 2016-08-09 | 1951 |
199 | THUR15000 | We introduce a labeled dataset of categorized images for evaluating sketch based image retrieval. Using Flickr, we downloaded about 3000 images for each of the ... | group saliency object detection visual attention sketch shape retrieval internet | link | 2020-06-06 | 2679 |
198 | THUS10000 | The THUS10000 benchmark dataset comprises 10,000 images, each of which has an unambiguous salient object and the object region is accurately annotated with p... | segmentation saliency object detection visual attention | link | 2015-01-11 | 2263 |
197 | Stanford Background Dataset | The Stanford Background Dataset is a new dataset introduced in Gould et al. (ICCV 2009) for evaluating methods for geometric and semantic scene understanding. T... | semantic segmentation urban classification nature geometry | link | 2019-12-16 | 3939 |
196 | New College Data | The New College Data Set contains 30GB of data intended for use by the mobile robotics and vision research communities. Our anticipated users are parties intere... | odometry urban path 3d reconstruction panorama stereo navigation | link | 2013-09-30 | 1804 |
195 | Yotta | The Yotta dataset consists of 70 images for semantic labeling given in 11 classes. It also contains multiple videos and camera matrices for 14km of driving. ... | semantic segmentation urban video camera 3d reconstruction classification | link | 2013-09-30 | 1709 |
194 | HCI 4D Lightfields | The HCI 4D Lightfields dataset contains 11 objects with corresponding lightfields for depth estimation. Datasets can be downloaded individually below. For ma... | 3d 4d lightfield benchmark depth reconstruction evaluation | link | 2020-04-06 | 2547 |
193 | City planar and non-planar | The city planar and non-planar dataset consists of urban scenes accompanied by text files describing the plane/non-plane locations. Training Set (University)... | plane detection 3d urban building estimation | link | 2022-12-21 | 1513 |
192 | Our Database of Faces | The Our Database of Faces (ORL) dataset contains ten different images of each of 40 distinct subjects. For some subjects, the images were taken at different tim... | face recognition illumination human expression | link | 2013-09-23 | 1873 |
191 | Daimler Mono Pedestrian Classification Benchmark | The Daimler Mono Pedestrian Classification Benchmark dataset consists of two parts: a base data set. The base data set contains a total of 4000 pedestrian- a... | pedestrian classification outdoor urban object scale illumination | link | 2013-09-18 | 1701 |
190 | Daimler Mono Pedestrian Detection Benchmark | The Daimler Mono Pedestrian Detection Benchmark dataset contains a large training and test set. The training set contains 15.560 pedestrian samples (image cut-o... | pedestrian detection outdoor urban mono scale object | link | 2013-09-18 | 1855 |
189 | Farman Institute 3D Point Sets | The Farman Institute 3D Point Sets dataset contains 11 objects captured by a 3D laser scanner. This dataset was peer-reviewed by Image Processing On Line: Farman Institu... | 3d laser scanner object reconstruction model point | link | 2013-09-18 | 1476 |
188 | KTH Multiview Football | The KTH Multiview Football dataset contains 771 images of football players, taken from 3 views at 257 time instances, with 14 annotated body joints. ... | multiview pedestrian tracking detection object camera outdoor game soccer pose recognition multitarget | link | 2018-06-28 | 3170 |
187 | Aspect Layout dataset | The Aspect Layout dataset is designed to allow evaluation of object detection for aspect ratios in perspective images. Author text: In this project we see... | detection object aspect ratio perspective layout | link | 2013-09-06 | 1336 |
186 | Symmetry Facades | The Symmetry Facades dataset contains 9 building facades with multiple images. It is used for coupled symmetry and structure from motion detection. Coupled Str... | symmetry facade building urban reconstruction sfm 3d repetition | link | 2013-09-05 | 2121 |
185 | Kung-Fu fighter Multi-View | The test sequences provide interested researchers a real-world multi-view test data set captured in the blue-c portals. The data is meant to be used for testing... | multiview tracking segmentation camera action | link | 2013-10-08 | 1752 |
184 | MSR Learning to Rank | MSR Learning to Rank comprises two large-scale datasets for research on learning to rank: MSLR-WEB30k with more than 30,000 queries and a random sampling of it MS... | rank learning sampling search | link | 2019-08-16 | 1403 |
183 | MSR RGB-D 7-Scenes | The MSR RGB-D Dataset 7-Scenes dataset is a collection of tracked RGB-D camera frames. The dataset may be used for evaluation of methods for different applicati... | depth video kinect tracking location reconstruction | link | 2020-03-16 | 1770 |
182 | MSR Action | The MSR Action dataset is a collection of various 3D datasets for action recognition. See details http://research.microsoft.com/en-us/um/people/zliu/action... | video action recognition detection reconstruction 3d | link | 2013-09-05 | 1740 |
181 | All I Have Seen (AIHS) | The All I Have Seen (AIHS) dataset is created to study the properties of total visual input in humans, for around two weeks Nebojsa Jojic wore a camera capturin... | video summary user study clustering similarity outdoor indoor scene 3d | link | 2018-09-19 | 1620 |
180 | Airport MotionSeg | The Airport MotionSeg dataset contains 12 sequences of videos of an airport scenario with small and large moving objects and various speeds. It is challenging b... | motion segmentation airport video clustering camera zoom | link | 2013-09-04 | 1942 |
179 | CMP Facades | The CMP Facade dataset consists of facade images assembled at the Center for Machine Perception, which includes 600 rectified images of facades from various sou... | facade rectification urban semantic classification recognition structure similarity segmentation | link | 2019-11-09 | 1836 |
178 | VSUMM (Video SUMMarization) | The VSUMM (Video SUMMarization) dataset consists of 50 videos from Open Video. All videos are in MPEG-1 format (30 fps, 352 x 240 pixels), in color and with sound. Th... | video summary type user study keyframe static similarity | link | 2024-07-11 | 2272 |
177 | SIPI textures | The Textures volume currently contains 154 images, all monochrome, 129 512x512 and 25 1024x1024. For the Brodatz texture images, the number in parenthesis (i... | texture, segmentation, classification, benchmark, synthetic, evaluation | link | 2013-08-20 | 2099 |
176 | Brodatz Album | The Brodatz dataset consists of 112 textures in grayscale images of various texture types. http://www.ee.oulu.fi/research/imag/texture/image_data/Brodatz32.h... | texture, segmentation, classification, benchmark, synthetic | link | 2014-12-23 | 2428 |
175 | Outex texture bench | The Outex dataset is part of a framework for empirical evaluation of texture classification and segmentation algorithms. The framework is being constructed acc... | texture, segmentation, classification, benchmark, synthetic | link | 2015-11-17 | 1730 |
174 | Pittsburgh Fast-food Image dataset | The Pittsburgh Fast-food Image dataset (PFID) consists of 4545 still images, 606 stereo pairs, 303 360° videos for structure from motion, and 27 privacy-preservi... | food recognition classification reconstruction video laboratory real | link | 2018-05-30 | 3344 |
173 | MuHAVi and MAS human action | The Multicamera Human Action Video Data (MuHAVi) and Manually Annotated Silhouette Data (MAS) are two datasets consisting of selected action sequences for the eval... | human action behavior segmentation video background | link | 2020-05-07 | 3898 |
172 | DynTex dataset | The DynTex dataset consists of a comprehensive set of Dynamic Textures. Dynamic, or temporal, texture is a spatially repetitive, time-varying visual pattern tha... | texture, segmentation, dynamic, synthetic, video repetition | link | 2013-08-12 | 1604 |
171 | CHALEARN Multi-modal Gesture Challenge | The CHALEARN Multi-modal Gesture Challenge is a dataset of 700+ sequences for gesture recognition using images, kinect depth, segmentation and skeleton data. ht... | gesture, kinect, recognition, human, action, illumination, depth, segmentation, skeleton | link | 2013-08-09 | 1698 |
170 | Sheffield Kinect Gesture (SKIG) dataset | The Sheffield Kinect Gesture (SKIG) dataset contains 2160 hand gesture sequences (1080 RGB sequences and 1080 depth sequences) collected from 6 subjects. All the... | gesture, kinect, recognition, human, action, illumination, depth | link | 2023-06-14 | 3508 |
169 | QMUL Junction Dataset | The QMUL Junction dataset is a busy traffic scenario for research on activity analysis and behavior understanding. Video length: 1 hour (90000 frames) Fra... | detection tracking crowd counting pedestrian video motion behavior | link | 2016-12-06 | 2630 |
168 | Mall Dataset | The Mall dataset was collected from a publicly accessible webcam for crowd counting and profiling research. Ground truth: Over 60,000 pedestrians were label... | detection tracking crowd counting pedestrian indoor video webcam | link | 2020-04-28 | 3950 |
167 | Text and Vision (TVGraz) Dataset | The Text and Vision (TVGraz) dataset is an annotated multi-modal dataset which currently contains 10 visual object categories, 4030 images and associated text. ... | text appearance classification evaluation | link | 2020-02-04 | 2139 |
166 | ICG Multi-Camera Datasets | The ICG Multi-Camera datasets consist of Easy Data Set (just one person) Medium Data Set (3-5 persons, used for the experiments) Hard Data Set (crowded sc... | multiview pedestrian tracking detection object camera calibration graz indoor video multitarget | link | 2015-06-19 | 2589 |
165 | ICG Multi-Camera and Virtual PTZ | The ICG Multi-Camera and Virtual PTZ dataset contains the video streams and calibrations of several static Axis P1347 cameras and one panoramic video from a sph... | multiview pedestrian tracking detection object camera calibration graz network video panorama crowd outdoor multitarget | link | 2020-05-20 | 2559 |
164 | ICG Lab 6 (Multi-Camera Multi-Object Tracking) | The ICG Lab 6 (Multi-Camera Multi-Object Tracking) dataset contains 6 indoor people tracking scenarios recorded at our laboratory using 4 static Axis P1347 came... | multiview pedestrian tracking detection object laboratory camera calibration evaluation segmentation graz | link | 2017-12-05 | 3232 |
163 | TUGRAZ ICG Longterm Pedestrian Dataset | The Longterm Pedestrian dataset consists of images from a stationary camera running 24 hours for 7 days at about 1 fps. It is used for adaptive detection and back... | pedestrian change detection background illumination robust indoor coffee graz multitarget | link | 2020-05-04 | 2062 |
162 | ICG PRID 2011 | The Person Re-ID (PRID) 2011 dataset was created in co-operation with the Austrian Institute of Technology for the purpose of testing person re-identification a... | pedestrian classification identification multiview trajectory illumination appearance change graz | link | 2017-06-29 | 2476 |
161 | ICG Annotated Facial Landmarks in the Wild (AFLW) | The Annotated Facial Landmarks in the Wild (AFLW) consists of a large-scale collection of annotated face images gathered from the web, exhibiting a large variet... | face detection landmark pose age annotation | link | 2020-02-18 | 4849 |
160 | Caltech Lanes Dataset | The Caltech Lanes dataset includes four clips taken around streets in Pasadena, CA at different times of day. The archive below includes 1225 individual frame... | urban road lane detection caltech pasadena | link | 2013-08-08 | 2092 |
159 | Caltech Game Covers Dataset | The Caltech Game Covers dataset consists of CD/DVD covers of video games. The set was downloaded from freecovers.net during the summer of 2008. The set includes... | classification retrieval game cover caltech hierarchy taxonomy | link | 2014-02-20 | 1549 |
158 | Caltech Buildings Dataset | The Caltech Buildings dataset consists of images taken for 50 buildings around the Caltech campus. Five different images were taken for each building from diffe... | building urban retrieval hierarchy taxonomy caltech | link | 2013-08-08 | 1594 |
157 | Background Models Challenge (BMC) | Background Models Challenge (BMC) is a complete dataset and competition for the comparison of background subtraction algorithms. The main topics concern: -... | background modeling change motion detection surveillance video segmentation | link | 2020-06-15 | 3103 |
156 | KUL Belgium Traffic Signs | BelgiumTS is a large dataset with 10000+ traffic sign annotations, thousands of physically distinct traffic signs. 4 video sequences recorded with 8 high resolu... | traffic sign classification urban road belgium camera calibration | link | 2020-01-07 | 2692 |
155 | KUL Belgium Traffic Sign Classification | The BelgiumTSC dataset is built for traffic sign classification purposes. It is a subset of the BelgiumTS dataset and contains cropped images around annotations for 62 ... | traffic sign classification urban road belgium | link | 2018-10-22 | 1969 |
154 | WordNet | WordNet is a large lexical database of English. Nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (synsets), each expressing a di... | language, hierarchy, imagenet, classification | link | 2013-08-07 | 1375 |
153 | MSRC Kinect Gesture Dataset | The Microsoft Research Cambridge-12 Kinect gesture dataset consists of sequences of human movements, represented as body-part locations, and the associated gest... | gesture, kinect, recognition, human, action | link | 2013-08-08 | 1829 |
152 | Colosseum and San Marco | The Colosseum and San Marco are two image datasets for dense multiview stereo reconstructions used for evaluating the visual photo realism. The datasets are ... | 3d, reconstruction, landmark, urban, sfm, aerial, street, flickr | link | 2017-11-28 | 2297 |
151 | People in WBCN | This dataset is for people tracking in wide baseline camera networks and was designed as a contest at ICPR 2012. The contest consists of two challenges: ... | detection, tracking, pedestrian, trajectory, crowd, overlap, occlusion, aerial | link | 2013-08-02 | 2713 |
150 | SDHA Contest | The Semantic Description of Human Activities (SDHA) was a contest at ICPR 2010. The contest is composed of three different types of activity recognition cha... | detection, tracking, pedestrian, trajectory, crowd, overlap, occlusion, aerial | link | 2013-07-31 | 2077 |
149 | NYU Depth v2 | The NYU-Depth V2 data set is comprised of video sequences from a variety of indoor scenes as recorded by both the RGB and Depth cameras from the Microsoft Kinec... | semantic segmentation depth kinect label reconstruction | link | 2017-06-01 | 4290 |
148 | NYU Depth v1 | The NYU-Depth data set is comprised of video sequences from a variety of indoor scenes as recorded by both the RGB and Depth cameras from the Microsoft Kinect. ... | semantic segmentation depth kinect label reconstruction | link | 2014-10-05 | 2309 |
147 | FlickrLogos-32 | The FlickrLogos-32 dataset contains photos showing brand logos and is meant for the evaluation of multi-class logo recognition as well as logo retrieval methods... | flickr, logo, detection, retrieval, image, object recognition, machine learning, classification brand boundingbox | link | 2019-11-12 | 2849 |
146 | Multiple Instance Learning dataset | MIL data sets used in our 2002 NIPS paper for Elephant, Musk, TREC http://www.cs.cmu.edu/~juny/MILL/MIL-experiments.htm... | machine learning, classification | link | 2013-05-30 | 1537 |
145 | KnapSack | KNAPSACK_01 is a dataset directory which contains some examples of data for 01 Knapsack problems. In the 01 Knapsack problem, we are given a knapsack of fixe... | machine learning, classification | link | 2018-06-04 | 1640 |
144 | MNIST hand-written letters | The MNIST database of handwritten digits, available from this page, has a training set of 60,000 examples, and a test set of 10,000 examples. It is a subset of ... | text, classification, letter | link | 2017-06-03 | 2574 |
143 | KITTI Odometry | http://www.cvlibs.net/datasets/kitti/eval_odometry.php Related Datasets TUM RGB-D Dataset: Indoor dataset captured with Microsoft Kinect and high-accuracy... | registration, localization, odometry, slam, matching, navigation, urban path 3d reconstruction | link | 2013-09-30 | 2158 |
142 | German Traffic Sign Recognition Benchmark | The German Traffic Sign Recognition Benchmark is a dataset for a multi-class detection problem in natural images; the organizers cordially invite you to participate. The b... | detection, traffic, urban, recognition | link | 2016-08-15 | 2424 |
141 | Berkeley Multimodal Human Action Database (MHAD) | The Berkeley Multimodal Human Action Database (MHAD) contains 11 actions performed by 7 male and 5 female subjects in the range 23-30 years of age except for on... | action classification multiview motion recognition | link | 2014-02-03 | 1843 |
140 | RGB-D Person Re-identification | The RGB-D Person Re-identification dataset is for person re-identification using depth information. The main motivation is that the standard techniques (such as... | identification, classification, shape, depth, pedestrian, 3d | link | 2014-10-08 | 2671 |
139 | image panorama gdbicp | Generalized Dual Bootstrap-ICP Algorithm ... | registration, panorama, matching | link | 2013-05-21 | 1478 |
138 | Buffy | The Buffy dataset contains images selected from the TV series, Buffy: the Vampire Slayer. We select a set of 452 images from the first two episodes for training... | segmentation, detection, buffy, movie, human | link | 2015-02-07 | 1515 |
137 | Synthetic CAD models | The Synthetic CAD Models dataset consists of X synthetic CAD models for detection of (planar) primitives. Efficient RANSAC for Point-Cloud Shape Detection Ruwe... | model, ransac, 3d object, reconstruction, primitive, synthetic | link | 2013-08-08 | 1563 |
136 | 3D Object in Clutter Recognition and Segmentation | The dataset is composed of 150 synthetic scenes, captured with a (perspective) virtual camera, and each scene contains 3 to 5 objects. The model set is composed... | recognition, segmentation, mesh, synthetic | link | 2013-08-08 | 1703 |
135 | Quad 6K | The Quad 6K dataset is a Structure-from-Motion dataset taken at Arts Quad at Cornell University campus and consists of 6514 images with ground truth positions o... | reconstruction, sfm, urban, groundtruth, landmark, 3d gps | link | 2020-03-19 | 2020 |
134 | Image Memorability | Image memorability dataset contains target and filler images, precomputed features and annotations, and memorability. It gives features and annotations for t... | aesthetics, semantic, quality, memorability | link | 2013-04-17 | 1494 |
133 | Aesthetic Photo.net | The Photo.net dataset contains 20,278 images with properties to aesthetics, emotion, and image quality. The only differences are that it is a larger dataset, an... | aesthetics, quality, emotion | link | 2020-05-12 | 2489 |
132 | Aesthetic Visual Analysis | Aesthetic Visual Analysis (AVA) dataset studies the organization of content by aesthetic preference. It contains over 250,000 images along with a rich variety o... | aesthetics, semantic, quality, memorability | link | 2017-01-10 | 2489 |
131 | Dubrovnik6K and Rome16K | The Dubrovnik6K and Rome16K datasets are image collections for SfM reconstruction, where the suffix refers to the number of images in the dataset. Dubrovnik6... | reconstruction, sfm, urban, landmark, dubrovnik, rome | link | 2019-12-14 | 1983 |
129 | NBVbench | The NBVbench is a reference object and benchmark criteria for defining and evaluating the performance of a next best view (NBV) method. ... | reconstruction, view, planning, geometry | link | 2013-04-16 | 1353 |
128 | The Stanford 3D Scanning Repository | The Stanford 3D Scanning Repository dataset is a compilation of 3D scans of objects like Stanford Bunny, Happy Buddha, Dragon, Armadillo and Lucy. These contain... | reconstruction, laser, bunny, triangulation | link | 2013-03-21 | 1978 |
127 | Stable Structure from Motion | Due to size limitations, the images of the Stable Structure from Motion datasets cannot be put online. Instead here are the tracked image points and the final reconstr... | sfm, reconstruction, geometry, stability, robust, 3d, landmark, church | link | 2013-08-08 | 2401 |
126 | ISPRS Urban Classification | ISPRS Test Project on Urban Classification and 3D Building Reconstruction The ISPRS working group III/4 announces the release of the 2D semantic labeling ben... | 3d, reconstruction, building, urban, city, semantic, classification, recognition | link | 2014-11-24 | 1588 |
125 | Google Street View Pittsburgh Research | The Google Street View Pittsburgh Research dataset is a street-level image collection provided by Google for research purposes. The dataset provided here co... | 3d, reconstruction, sfm, urban, pittsburgh, panorama | link | 2020-03-10 | 3544 |
124 | CMU Geometric Context | The CMU Geometric Context dataset by Derek Hoiem, Alexei A. Efros, Martial Hebert consists of 300 images used for training and testing the geometric context met... | reconstruction, single view, depth, context, geometry | link | 2016-06-29 | 1676 |
123 | CMU/VMR Urban Image+Laser | CMU/VMR Urban Image+Laser dataset contains 372 images linked with 3D laser points projections. There are additional images (due to the laser scanner being turne... | reconstruction, sfm, urban, semantic, segmentation, laser | link | 2013-04-02 | 1885 |
122 | Symmetric Bundle Adjustment | The Symmetric Bundle Adjustment dataset contains four sequences of the CAB building, Barcelona, Redmond and Capitole for 3D reconstruction considering symmetrie... | reconstruction, sfm, urban, bundle adjustment, symmetry | link | 2013-03-12 | 1618 |
121 | Oakland 3D | This repository contains labeled 3-D point cloud laser data collected from a moving platform in an urban environment. Data are provided for research purposes. ... | reconstruction, sfm, urban, semantic, segmentation, laser lidar 3d city | link | 2018-10-10 | 1978 |
120 | Samantha | The SAMANTHA (Structure-and-Motion Pipeline on a Hierarchical Cluster Tree) dataset contains 4 sequences for 3D reconstruction: Pozzoveggiani, Piazza Dante, Pia... | reconstruction, sfm, landmark, model, geometry | link | 2013-03-12 | 1987 |
119 | AdelaideRMF | AdelaideRMF: Robust Model Fitting Data Set AdelaideRMF is a data set for robust geometric model fitting (homography estimation and fundamental matrix estimat... | feature, matching, geometry, model | link | 2017-11-21 | 2130 |
118 | Hermann Maier Nagano 1998 | The Hermann Maier Nagano 1998 dataset is used for an extremely difficult deformable tracking scenario. There are 4 parts to this sequence a) 00088 - 00554 Maie... | tracking, single, trajectory, clutter, deformation | n/a | 2013-03-12 | 1415 |
117 | YorkUrbanDB | The York Urban Line Segment Database is a compilation of 102 images (45 indoor, 57 outdoor) of urban environments consisting mostly of scenes from the campus of... | vanishing, point, pose, urban, reconstruction, outdoor, geometry, manhattan | link | 2013-09-18 | 1519 |
116 | Sheffield Building | Sheffield Building Image Dataset consists of over 3,000 low-resolution images of forty different buildings – typically between 70 and 120 images per building.... | retrieval, classification, urban, sheffield | link | 2013-03-12 | 1466 |
115 | Pankrac Marseille | Our repetitive pattern dataset with 106 images of app. 30 buildings from Pankrac, Prague and Marseille appearing in more than one image, number of appearances r... | classification, retrieval, symmetry, repetition, urban | link | 2013-03-13 | 1382 |
114 | TUD Shapes 1+2 | This material is supplementary to Michael Stark, Bernt Schiele. How Good are Local Features for Classes of Geometric Objects. Eleventh IEEE International C... | shape object classification tool binary | link | 2013-08-08 | 2566 |
113 | Penn-Fudan Pedestrian | Penn-Fudan Pedestrian Detection and Segmentation... | pedestrian detection segmentation background motion | link | 2013-08-08 | 1935 |
112 | SHREC | Unlike the previous SHREC contests, the objective of this SHREC 2012 contest is to evaluate the performance of 3D-mesh segmentation techniques instead of evalua... | segmentation, mesh, part, 3d | link | 2013-07-29 | 1345 |
111 | Grabcut | To evaluate our method we designed a new ground truth database of 50 images. The following zip-files contain: Data, Segmentation, Labelling - Lasso, Labelling -... | segmentation, boundingbox, color, optimization, background | link | 2015-06-19 | 1498 |
110 | EITZ Sketch Quality | Humans have used sketching to depict our visual world since prehistoric times. Even today, sketching is possibly the only rendering technique readily available ... | shape, matching, retrieval, partial, sketch | link | 2022-01-05 | 1948 |
109 | EITZ Sketch-Based Image Retrieval | We introduce a benchmark for evaluating the performance of large scale sketch-based image retrieval systems. The necessary data is acquired in a controlled user... | shape, matching, retrieval, partial, sketch | link | 2014-02-11 | 2220 |
108 | ICG Sketch Retrieval | The ICG Sketch Retrieval dataset consists of XXX hand-drawn sketches for five categories. It is used for content-based image retrieval using shape features for ... | shape, matching, retrieval, partial, sketch | n/a | 2014-02-11 | 2352 |
107 | BIWI Pedestrians | We provide the three datasets used for testing our system for our ICCV 2007 publication, including annotations. Data was recorded using a pair of AVT Marlins mo... | detection, tracking, pedestrian, trajectory, crowd, overlap, occlusion | link | 2013-03-12 | 2559 |
106 | BIWI Walking Pedestrians (EWAP) | The BIWI Walking Pedestrians (EWAP) dataset shows walking pedestrians in busy scenarios from a bird eye view. Manually annotated. Data used for training in our ... | detection, tracking, pedestrian, trajectory, crowd, overlap, occlusion, aerial | link | 2020-05-04 | 3916 |
105 | MSR 3D Video | These sequences were used for our video interpolation work described in High-quality video view interpolation using a layered representation, C.L. Zitnick, ... | reconstruction, camera, segmentation, depth | link | 2013-03-12 | 1802 |
104 | Make3D Depth | The Make3D Depth dataset is designed to learn features to estimate scene depth from a single image. This dataset contains aligned image and range data: Make3... | depth, learning, single view, outdoor, indoor | link | 2019-04-03 | 2648 |
103 | COIL-100 | The COIL-100 (Columbia University Image Library) consists of 100 objects. For formal documentation look at the corresponding compressed technical report, [gzipp... | classification, retrieval | link | 2013-03-12 | 1456 |
102 | Tiny Images | The Tiny Images dataset consists of 79,302,017 images, each being a 32x32 color image. This data is stored in the form of large binary files which can be accese... | classification, tiny, color, retrieval | link | 2013-03-12 | 1574 |
101 | CIFAR-10 / 100 | The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images. ... | classification, tiny, color, patch, scene, object | link | 2013-08-08 | 1586 |
100 | Sowerby | The Sowerby dataset contains 105 images for semantic segmentation.... | semantic, segmentation, outdoor | n/a | 2014-09-26 | 1672 |
99 | BSDS500 | This new dataset is an extension of the BSDS300, where the original 300 images are used for training / validation and 200 fresh images, together with human anno... | segmentation, edge, contour, detection | link | 2019-05-18 | 1722 |
98 | BSDS300 | The goal of this work is to provide an empirical basis for research on image segmentation and boundary detection. To this end, we have collected 12,000 hand-la... | segmentation, edge, contour, detection | link | 2013-03-12 | 2422 |
97 | KU Leuven Facade | The KU Leuven Facade dataset is used for architectural styles classification. M. Mathias, A. Martinovic, J. Weissenberg, S. Haegler, L. Van Gool: Automatic A... | classification, architecture, urban | link | 2018-06-11 | 2054 |
96 | USPS Handwritten Digits | USPS: 10 classes, 7291 training examples, 2007 test examples, 256 features; 8-bit grayscale images of "0" through "9"; handwritten digits; ... | text, recognition, classification, handwritten | link | 2020-06-08 | 4588 |
95 | Stroke Width Transform Text | Stroke Width Transform Text dataset is by Boris Epstein and consists of 307 images and XXX text instances. Detecting Text in Natural Scenes with Stroke Wid... | text, detection, recognition, classification | link | 2015-04-24 | 1994 |
94 | Chars74K | The Chars74K dataset consists of 64 classes (0-9, A-Z, a-z), 7705 characters obtained from natural images, 3410 hand drawn characters using a tablet PC, 62992 s... | text, detection, recognition, classification | link | 2018-08-28 | 2827 |
93 | Street View Text | The Street View Text (SVT) dataset contains 647 words and 3796 letters in 249 images harvested from Google Street View. The dataset is more challenging becaus... | text, detection, recognition, classification, outdoor, urban | link | 2014-01-13 | 1952 |
92 | ICDAR 2011 | This challenge is set up around three tasks: Text Localisation, Text Segmentation and Word Recognition. Participation in any or all tasks is welcome. Check the ... | text, detection, recognition, classification | link | 2016-06-01 | 1513 |
91 | ICDAR 2003 | The ICDAR 2003 datasets available for download on this site: Robust Reading , Robust Word Recognition , Robust OCR , Text Locating and Cursive Script . Pleas... | text, detection, recognition, classification | link | 2018-05-16 | 1817 |
90 | eTrims | The eTrims dataset is comprised of two datasets, the 4-Class eTRIMS Dataset with 4 annotated object classes and the 8-Class eTRIMS Dataset with 8 annotated obje... | semantic, segmentation, urban, reconstruction | link | 2013-03-12 | 1356 |
89 | Corel Photo Gallery | This image database is a part of the "Corel Gallery Magic" (commercial product). It contains 80000 images divided into 800 categories of 100 images. These image... | semantic, segmentation, outdoor | n/a | 2017-01-19 | 1588 |
88 | Change Detection | The dataset folder contains 7 folders (one for each category). Each category folder contains 4 to 6 folders (one for each video). Each video folder contains: ... | change, detection, background | link | 2013-03-13 | 1491 |
87 | Simpsons 40 years | Simpsons Homer 40 years is a dataset showing Homer Simpson over the course of 40 years. It is used for video segmentation and shape matching between frames.... | video, segmentation, shape, matching | n/a | 2023-04-28 | 2306 |
86 | ICG Graz240 | The ICG Graz240 dataset consists of 240 buildings with 5400 redundant images with a total of 5542 window instances. Window detection itself is difficult due to ... | segmentation, detection, semantic, urban, graz | link | 2018-12-05 | 2439 |
85 | Leaves | The Leaves dataset from X contains X images of leaves. Leaves dataset taken by Markus Weber. California Institute of Technology PhD student under Pietro Per... | shape, binary, matching, retrieval, partial | n/a | 2023-10-22 | 3180 |
84 | Aachen Retrieval | The Aachen dataset consists of 4479 images taken with multiple cameras (3GB), 369 query images taken with the camera of a mobile phone together with their SIFT ... | retrieval, aachen, landmark, sfm, reconstruction | link | 2013-03-11 | 2007 |
83 | Ikonos Aerial | Since its launch in September 1999, Space Imaging IKONOS earth imaging satellite has provided a reliable stream of image data that has become the standard for c... | reconstruction, sfm, urban, aerial | link | 2013-03-11 | 1867 |
82 | Zurich City Hall | Zurich City Hall dataset (also CIPA dataset) Information: Place: City Hall, Zurich, Switzerland Number of Images: 15, 1280 x 1000 pixels Camera: Fuji DS 30... | reconstruction, sfm, urban, zurich | link | 2018-03-05 | 1955 |
81 | Zurich Hoengg | Zurich Hoengg (Switzerland) is an aerial dataset. The dataset consists of 4 aerial images in colour (Figures 2-5), scanned with 14 microns, the format is Ti... | aerial, semantic, segmentation, outdoor | link | 2013-03-11 | 1718 |
80 | Hopkins 155 | The Hopkins 155 Dataset has been created with the goal of providing an extensive benchmark for testing feature based motion segmentation algorithms. It contains... | flow, stereo, motion, segmentation, urban | link | 2015-04-01 | 2385 |
79 | LabelMe | The goal of LabelMe is to provide an online annotation tool to build image databases for computer vision research. You can contribute to the database by visitin... | segmentation, semantic, outdoor, detection, urban, software | link | 2013-03-14 | 1757 |
78 | Caltech Pedestrian | The Caltech Pedestrian Dataset consists of approximately 10 hours of 640x480 30Hz video taken from a vehicle driving through regular traffic in an urban environ... | pedestrian, detection, urban | link | 2019-12-11 | 3599 |
77 | Daimler Stereo Pedestrian | Daimler Stereo Pedestrian Detection Benchmark C. Keller, M. Enzweiler, and D. M. Gavrila, A New Benchmark for Stereo-based Pedestrian Detection, Proc. of th... | pedestrian, detection, urban | link | 2013-03-13 | 1697 |
76 | Daimler Pedestrian Classification | Daimler Multi-Cue, Occluded Pedestrian Classification Benchmark Training and test samples have a resolution of 48 x 96 pixels with a 12-pixel border around t... | detection, classification, pedestrian, urban | link | 2013-03-11 | 1934 |
75 | ETHZ Shape | The ETHZ Shape classes dataset from Vittorio Ferrari [?] consists of five object classes and a total of 255 images. All classes contain significant intra-class ... | shape, detection, matching, segmentation, clutter, applelogo, bottle, giraffe, nature, swan, mug | link | 2014-02-11 | 2930 |
74 | PMVS 3D Photography | The following are multiview stereo data sets captured in our lab: a set of images, camera parameters and extracted apparent contours of a single rigid object. E... | sfm, reconstruction, depth, dense, mesh | link | 2017-01-31 | 2208 |
73 | Strecha Dense MVS | An evaluation benchmark for dense MVS for these datasets fountain-P11, Herz-Jesu-P8, entry-P10, castle-P19, Herz-Jesu-P25, castle-P30. Images (corrected for... | sfm, reconstruction, benchmark, depth, dense, mesh | link | 2017-12-11 | 3690 |
72 | Acute3D Aiguille du Midi MVS | Aiguille du Midi. France showing photographs with Camera: Mamiya ZD. 55mm. - Resolution: 5Mpixels, 53 images - Photographer: B. Vallet (Imagine/EVD - 2006) ... | sfm, reconstruction, mesh, large scale, outdoor | link | 2013-03-21 | 1828 |
71 | University of Tsukuba Stereo Flow | This dataset contains 1800 stereo pairs with ground truth disparity maps, occlusion maps and discontinuity maps that will help to further develop the state of t... | flow, depth, stereo, graphics, synthetic, optical | link | 2013-08-08 | 3377 |
70 | MPI Sintel Flow | The MPI Sintel Flow dataset is an optical flow / stereo dataset based on the Blender movie Sintel: http://www.sintel.org The goal of this project was to crea... | flow, depth, stereo, graphics, synthetic | link | 2013-08-08 | 1978 |
69 | HCI Robust Vision | Estimate robust and reliable depth or motion fields on our challenging real world videos! ... | flow, depth, stereo, outdoor | link | 2016-09-08 | 1972 |
68 | The KITTI Vision Benchmark Suite | We take advantage of our autonomous driving platform Annieway to develop novel challenging real-world computer vision benchmarks. Our tasks of interest are: ste... | stereo, depth, flow, detection tracking, reconstruction, sfm, odometry, segmentation, semantic car depth | link | 2017-11-26 | 2841 |
67 | Middlebury MVS Dino | The object is a plaster dinosaur (stegosaurus). Click on thumbnail for a full-sized (640x480) image. Resolution of ground truth model: 0.00025m (you may wish to... | sfm, reconstruction, benchmark, multiview, 3d | link | 2013-09-20 | 1905 |
66 | Middlebury MVS Temple | The object is a plaster reproduction of Temple of the Dioskouroi in Agrigento, Sicily. Click on thumbnail for a full-sized (640x480) image. Resolution of ground... | sfm, reconstruction, benchmark, multiview, 3d | link | 2013-09-20 | 1787 |
65 | Middlebury Stereo | Two-frame stereo depth estimation... | flow, depth, stereo | link | 2014-03-06 | 2043 |
64 | Middlebury Flow | ... | flow, depth | link | 2013-03-11 | 1450 |
63 | Paris500k | The Paris500k dataset consists of 501,356 geotagged images collected from Flickr and Panoramio. The dataset was collected from a geographic bounding box rather ... | retrieval, paris, landmark, geotag, flickr, panoramio, sfm, reconstruction | link | 2019-05-22 | 2283 |
62 | Deformed Lattice Detection | The Deformed Lattice Detection In Real-World Images dataset is used for regular grid detection. The authors have developed a robust and fast lattice detection a... | texture, segmentation, symmetry, lattice, detection, urban | link | 2013-03-11 | 1672 |
61 | Digital Papercutting | Papercutting is a widespread and ancient artform which, as far as we could tell, had no previous computational treatment. We developed algorithms to analyze the... | symmetry | link | 2013-03-11 | 1275 |
60 | PSU HUB | The PSU HUB dataset is a detection and tracking dataset. Ground truth trajectory and grouping information for pedestrians walking in the PSU student union building... | detection, tracking, pedestrian, trajectory, crowd, overlap, occlusion | link | 2013-07-19 | 2523 |
59 | Near-Regular Textures | The Near-Regular Textures dataset contains textures from completely regular to completely irregular patterns, with a focus on near-regular textures. It also inc... | texture, segmentation, classification, symmetry, regular, stochastic | link | 2013-03-11 | 1705 |
58 | INRIA Horses | The INRIA Horses dataset from Frederic Jurie and Vittorio Ferrari consists of 170 images with one or more horses in side-view at several scales and cluttered ba... | detection, shape, segmentation, clutter, nature, horse | link | 2013-03-11 | 2818 |
57 | Weizmann Horses | The multi-scale Weizmann horses (originally from Eran Borenstein, adapted by Jamie Shotton) consists of 656 images which are split into 50+50 training, 50+50 vali... | detection, shape, segmentation, clutter, nature, horse | link | 2013-03-11 | 3283 |
56 | ETHZ Extended Shape | The ETHZ Extended Shape classes dataset from Konrad Schindler is a larger dataset of shape categories, created by merging ETHZ shape classes with Konrad Schindler... | detection, shape, segmentation, clutter | link | 2013-03-11 | 2212 |
55 | Prague Texture Segmentation | The Prague Texture Segmentation Datagenerator and Benchmark is designed to mutually compare and rank different (dynamic/static) texture segmenters (supervised o... | texture, segmentation, classification, benchmark, synthetic | link | 2013-08-08 | 1752 |
54 | Notre Dame | The Notre Dame de Paris dataset is used for 3D SfM reconstruction and contains 715 images provided by Noah Snavely. There are also versions for NotreDame by Mic... | limited, flickr, landmark, sfm, paris, frontview, reconstruction, 3d, pointcloud | link | 2015-06-19 | 2072 |
53 | DTU Robot | The DTU Robot dataset consists of color images of 60 scenes acquired in a controlled setup from 119 different positions and under different lighting. For each s... | feature, detection, description, matching, sfm, reconstruction, illumination | link | 2016-05-15 | 2002 |
52 | Graffiti | The Graffiti dataset by Krystian Mikolajczyk and Cordelia Schmid contains 48 images split into 8 sequences with 6 images each showing different structured and t... | feature, detection, description, rectification, benchmark | link | 2019-07-10 | 1764 |
51 | PN Learning | PN Learning - How does TLD work? Tracking estimates the object location as long as the object is visible. During tracking all observed patterns of the object... | single target tracking learning object pedestrian bike face | link | 2017-11-28 | 1786 |
50 | Babenko tracking | The Babenko tracking dataset contains 12 video sequences for single object tracking. For each clip they provide (1) a directory with the original image s... | tracking single object animal face occlusion video | link | 2016-08-08 | 3719 |
49 | PhotoTourism Pair Patch | The data is taken from Photo Tourism reconstructions from Trevi Fountain (Rome), Notre Dame (Paris) and Half Dome (Yosemite). Each dataset consists of a series ... | feature matching description pair sfm patch learning | link | 2018-01-10 | 1850 |
48 | CALTECH 101 Category Patch Pairs | The CALTECH 101 Category Patch Pairs dataset measures invariance to intra-category variation. The dataset contains a training set and testing set of image patc... | feature, matching, description, pair, binary | link | 2017-02-14 | 3974 |
47 | CMP Retrieval | CMP Dataset by Ondra Chum contains 5 million images collected from the internet.... | retrieval, urban, large scale | link | 2013-03-11 | 1575 |
46 | Paris Retrieval | The Paris dataset consists of 6412 images. Images have high resolution and are in JPEG format. http://www.robots.ox.ac.uk/~vgg/data/parisbuildings/paris_1.... | retrieval, urban, paris, landmark | link | 2016-10-11 | 1788 |
45 | Oxford Buildings | The Oxford Buildings dataset by James Philbin and Andrew Zisserman consists of 5062 images collected from Flickr by searching for particular Oxford landmarks. T... | retrieval, urban, oxford, landmark | link | 2017-04-17 | 1968 |
44 | UK Bench | The UK Bench dataset from Henrik Stewenius and David Nister contains 10200 images in N=2550 groups of four images each at size 640x480. The images are rotated... | retrieval image object centered rotation | link | 2020-04-02 | 4277 |
43 | ZuBud | The Zurich Building dataset (ZuBud) from Hao Shao, Tomas Svoboda and Luc Van Gool [?] contains 1005 images of 201 buildings, each in five views. There is also ... | retrieval, urban, procedural, rectification | link | 2013-03-11 | 1928 |
42 | Hollywood Videos | The Hollywood-2 dataset contains 12 classes of human actions and 10 classes of scenes distributed over 3669 video clips and approximately 20.1 hours of video in t... | action, classification, video, segmentation | link | 2013-03-12 | 2369 |
41 | KTH Action | The current video database contains six types of human actions (walking, jogging, running, boxing, hand waving and hand clapping) performed several times by 2... | action, classification, video, segmentation | link | 2013-03-12 | 1685 |
40 | Weizmann Action | The Weizmann actions dataset by Blank, Gorelick, Shechtman, Irani, and Basri consists of ten different types of actions: bending, jumping jack, jumping, jump in... | video, segmentation, action, classification | link | 2015-07-14 | 1842 |
39 | Leuven Stereo Scene | The Leuven Stereo Scene dataset is a scene and depth dataset. There exist two variants of this dataset - a CVPR 2007 paper [1] by Leibe et al. for detection and... | segmentation, semantic, reconstruction, urban, sfm, 3d, leuven, depth, stereo | link | 2018-06-28 | 3599 |
38 | IcgBench | The Interactive Segmentation (IcgBench) dataset from Jakob Santner contains 243 images and 262 segmentations. Some images have multiple segmentations. The annota... | interactive, segmentation, user | link | 2018-12-11 | 1626 |
37 | MSRC vNIPS | The MSRC vNIPS dataset is the MSRC v2 dataset with new annotations for much more accurate segmentations for 93 images. Efficient Inference in Fully Connected... | segmentation, semantic, outdoor | link | 2013-03-11 | 1657 |
36 | MSRC v2 | The MSRC v2 dataset is an extension of the MSRC v1 dataset from Microsoft Research in Cambridge. It contains 591 images and 23 object classes with accurate pixe... | segmentation, semantic, outdoor | link | 2016-08-28 | 3975 |
35 | MSRC v1 | The MSRC v1 dataset from Microsoft Research in Cambridge contains 240 images and 9 object classes with coarse pixel-wise labeled images. The dataset is commonl... | segmentation, semantic, outdoor | link | 2016-09-07 | 3473 |
34 | CamVid | The Cambridge-driving Labeled Video Database (CamVid) dataset from Gabriel Brostow [?] contains ten minutes of video footage and corresponding semantically labe... | sfm, depth, semantic, segmentation, urban | link | 2020-05-04 | 7396 |
33 | ECP New York 2011 | The ECP New York dataset contains 10 manually segmented buildings from New York City, USA. Segmentation evaluation using the Dice coefficient is calculated for the ... | segmentation, semantic, procedural, reconstruction, urban, newyork | link | 2013-08-08 | 1549 |
32 | ECP Paris 2011 | The ECP Paris 2011 dataset consists of 104 images taken from rue Monge in the fifth district of Paris; we kept only 20 for training and 10 for testing. Howev... | segmentation, semantic, procedural, reconstruction, urban, paris | link | 2013-08-08 | 1650 |
31 | ECP Paris 2010 | The Ecole Centrale Paris 2010 (Paris 2010) dataset consists of 30 images of densely annotated building facades in seven classes - wall, window, sky, shop, balco... | segmentation, semantic, procedural, reconstruction, urban, paris | link | 2013-03-11 | 1862 |
30 | ICG Graz50 | This is a dataset of rectified facade images and semantic labels. The goal of the annotation is to study the layout of the facades. It contains 50 images of... | segmentation, semantic, procedural, reconstruction, urban, graz | link | 2014-01-28 | 2099 |
29 | The Yale Face | The Yale Face dataset from A. Georghiades contains 5760 single light source images of ten subjects, each shown in 9 poses and 64 illumination setups (leading to... | face, pedestrian, detection, pose, illumination | link | 2015-06-23 | 2042 |
28 | CMU Faces - Frontal faces | The MIT + CMU frontal face dataset from H. Rowley contains 130 images with 507 labeled frontal faces from movie, portrait and media sources. It is mostly graysc... | frontview, face, detection object boundingbox | link | 2015-06-19 | 1939 |
27 | Idiap/ETHZ Faces and Poses | The Idiap/ETHZ Faces and Poses dataset by L. Jie, B. Caputo and V. Ferrari contains 1703 image-caption pairs. [author] Captions contain the names of some of... | face, pose, pedestrian, text | link | 2013-03-11 | 1758 |
26 | We Are Family Stickmen | The We Are Family Stickmen dataset from Eichner and Ferrari contains X images with X people in group photos for human pose estimation with annotated 2D human bo... | pose, pedestrian, body part | link | 2013-03-11 | 1812 |
25 | PASCAL VOCs | The PASCAL VOC Challenge datasets by Mark Everingham are yearly datasets which have a central evaluation server; the final test data is not released. The late... | detection segmentation pose pedestrian chair animal car building airplane | link | 2020-03-23 | 2333 |
24 | UIUC Cars | The UIUC Cars dataset by Shivani Agarwal, Aatif Awan and Dan Roth contains images of side views of cars for use in evaluating object detection algorithms. The ... | car, sideview, detection, scale, recognition, urban | link | 2019-08-28 | 2447 |
23 | Graz02 | The Graz02 dataset by Andreas Opelt and Axel Pinz contains four categories of images: bikes, people, cars and a single background class. The annotation has been... | bike, pedestrian, background, detection, clutter, car, graz | link | 2014-04-24 | 2379 |
22 | Graz01 | The Graz01 dataset by Andreas Opelt and Axel Pinz contains four types of images: bikes, people, background with no bikes, background with no people.... | bike, pedestrian, background, detection, clutter, graz, occlusion | link | 2013-08-08 | 2385 |
21 | ImageNET | The ImageNET dataset is the latest dataset by Li Fei-Fei containing various datasets ranging from 1000 to 10000 categories.... | retrieval, segmentation, classification | link | 2013-03-11 | 2093 |
20 | CALTECH 256 | The CALTECH 256 dataset by Li Fei-Fei contains 30607 images for 256 categories.... | classification centered object scene image | link | 2013-08-08 | 1811 |
19 | CALTECH 101 | The CALTECH 101 dataset by Li Fei-Fei contains images for 101 categories with about 40 to 800 images per category. Most categories have about 50 images at rough... | classification centered object scene image | link | 2013-08-08 | 1854 |
18 | Leeds Cows | The Leeds Cows dataset by Derek Magee consists of 14 different video sequences showing a total of 18 cows walking from right to left in front of different backg... | detection segmentation cow video background animal | link | 2013-08-08 | 2114 |
17 | TUD Motorbike | The TUD Motorbike dataset from Bastian Leibe contains 115 images collected from the internet. Each image contains one or more motorbikes at different scales and... | motorbike, detection, pascal | link | 2019-01-29 | 2080 |
16 | PETS 2009 | The PETS 2009 dataset contains 3 parts showing multi-view sequences containing pedestrians walking in an outdoor environment. The parts are used for person coun... | frontview, outdoor, pedestrian, detection, tracking, overlap, occlusion multitarget, human | link | 2018-11-30 | 3414 |
15 | PETS 2006 | The PETS 2006 dataset contains 7 parts showing multi-sensor sequences containing left-luggage scenarios with increasing scene complexity at a train station scen... | frontview, indoor, pedestrian, detection, tracking, multitarget | link | 2019-05-29 | 2755 |
14 | INRIA People | The INRIA People dataset from Navneet Dalal and Bill Triggs [DalalCVPR2005] consists of training and testing data. The training contains 1805 images and X peopl... | detection, pedestrian, sideview, frontview, human, boundingbox | link | 2015-06-19 | 3024 |
13 | CBCL / MIT Pedestrian | MIT Pedestrian dataset from Papageorgiou and Poggio [IJCV2000] contains 509 training and 200 test images of pedestrians in city scenes (plus left-right reflecti... | pedestrian, frontview, detection, urban, people, boundingbox | link | 2024-04-18 | 2810 |
12 | TUD Pedestrians training | The TUD Pedestrians training dataset from Micha Andriluka, Stefan Roth and Bernt Schiele consists of 210 and 400 training images with X pedestrians with signifi... | segmentation, pedestrian, sideview | link | 2013-03-11 | 3437 |
11 | TUD Campus | The TUD Campus dataset from Micha Andriluka, Stefan Roth and Bernt Schiele consists of 71 images and 303 highly overlapping pedestrians with large scale changes... | segmentation, pedestrian, sideview, overlap | link | 2013-03-11 | 2838 |
10 | TUD Pedestrians | The TUD Pedestrians dataset from Micha Andriluka, Stefan Roth and Bernt Schiele [AndrilukaCVPR2008] consists of 250 images with 311 fully visible people with si... | segmentation, pedestrian, sideview | link | 2019-08-15 | 3538 |
9 | TUD Crossing tracking | The TUD Crossing dataset from Micha Andriluka, Stefan Roth and Bernt Schiele consists of 201 images with 1008 highly overlapping pedestrians with significant va... | tracking detection segmentation multitarget pedestrian sideview overlap urban | link | 2015-06-19 | 4957 |
8 | Tools2D | The Tools 2D dataset from Bronstein, Bronstein, Bruckstein, and Kimmel [?] is used for partial similarity experiments and consists of 15 shapes: 5 humans, 5 horses and ... | shape, binary, matching, retrieval, partial | link | 2014-02-11 | 3201 |
7 | Mythological Creatures | The Mythological Creatures dataset consists of articulated shapes (silhouettes) for partial similarity experiments and contains 15 shapes: 5 humans, 5 horses and 5 cent... | shape, binary, matching, retrieval, partial, animal | link | 2015-06-23 | 3301 |
6 | SIID | The SIID silhouette dataset contains... and is from the Shape Indexing of Image Database (SIID). Download SIID silhouette dataset http://www.lems.brown.edu/... | shape, binary, matching, retrieval | link | 2019-08-26 | 3671 |
5 | KIMIA216 | The Kimia 216 dataset has 18 classes, each consisting of 12 images. It contains shape silhouettes for birds, bones, brick, camels, car, children, classic cards, elephan... | shape, binary, matching, retrieval, kimia, animal | link | 2024-10-09 | 4020 |
4 | KIMIA99 | The Kimia 99 dataset has 9 classes, each consisting of 11 images. It is part of the Shape Indexing of Image Database (SIID) project, which also contains the SIID... | shape, binary, matching, retrieval, kimia | link | 2015-07-29 | 3326 |
3 | KIMIA25 | The Kimia 25 dataset consists of 6 classes and 25 images. It is part of the Shape Indexing of Image Database (SIID) project, which also contains the SIID silhouette ... | shape binary matching retrieval kimia | link | 2015-08-26 | 3608 |
2 | MPEG-7 Core Experiment CE-Shape-1 | MPEG-7 Core Experiment CE-Shape-1 [?] is a popular database for shape matching evaluation consisting of 70 shape categories, where each category is represented ... | shape, binary, matching, retrieval, bullseye | link | 2017-03-02 | 5055 |