This website provides a list of frequently used computer vision datasets. Wait, there is more!
There is also a description of common problems, pitfalls and characteristics for each dataset, and now a searchable TAG cloud.
Plus, this is open for crowd editing (if you pass the ultimate Turing test)! - Questions? yacvid [at] hayko [dot] at
Content, Design and Idea © by Hayko Riemenschneider, 2011-2016. Texts and images are subject to copyright by the respective authors.
Hey! If you're reading this, why not help and update the description of the dataset you're working on?
Add a new dataset
« showing 529 tags of 529 total tags for 378 datasets (1.4) »
|ID||Name||Description||Tags||Link||Updated||Views|
|371||ICS-FORTH MHAD101 Action Co-segmentation||This is a custom generated dataset designed for the task of action co-segmentation in pairs of action sequences. The dataset contains 101 pairs of action se...||action cosegmentation, temporal segmentation, motion-capture-data, time-series||link||2017-04-05||53|
|366||Multi-Camera Action Dataset||An indoor action recognition dataset which consists of 18 classes performed by 20 individuals. Each action is individually performed 8 times (4 daytime and ...||Multi-Camera; Action Recognition; Cross-View Recognition; Open-View Recognition||link||2017-03-22||78|
|355||IMPART multi-modal/multi-view||The multi-modal/multi-view datasets were created in cooperation between the University of Surrey and Double Negative within the EU FP7 IMPART project. The sourc...||multi-view multi-mode video rgbd lidar 3d model color indoor outdoor dynamic action face human emotion||link||2017-01-01||166|
|305||SPHERE human skeleton movements||The SPHERE human skeleton movements dataset was created using a Kinect camera, which measures distances and provides a depth map of the scene instead of the clas...||human action behavior motion movement video skeleton depth kinect||link||2016-03-24||562|
|276||TST TUG (Timed Up and Go)||The TUG (Timed Up and Go test) dataset consists of actions performed three times by 20 volunteers. The people involved in the test are aged between 22 and 39, w...||action recognition time kinect wearable accelerometer human video||link||2015-05-02||442|
|275||TST fall detection||It is composed of ADL (activity daily living) and fall actions simulated by 11 volunteers. The people involved in the test are aged between 22 and 39, with diff...||action recognition detection depth kinect wearable accelerometer human video||link||2017-03-14||770|
|272||Stanford 40 Actions||The Stanford 40 Actions dataset contains images of humans performing 40 actions. In each image, we provide a bounding box of the person who is performing the ac...||human action recognition detection boundingbox||link||2015-06-19||702|
|264||Domain-specific Personal Videos Highlight Dataset||The domain-specific personal videos highlight dataset from the paper describes a fully automatic method to train a domain-specific highlight ranker for raw p...||video summarization saliency wearable human action recognition domain||link||2015-05-02||569|
|261||MPI Multi-View Collection GVV datasets||Welcome to the homepage of the gvvperfcapeva datasets. This site serves as a hub to access a wide range of datasets that have been created for projects of the G...||video multiview tracking face mesh reconstruction depth human action pose||link||2014-12-10||588|
|252||Volleyball Activity Dataset 2014||This dataset contains 7 challenging volleyball activity classes annotated in 6 videos from professionals in the Austrian Volley League (season 2011/12). A total...||action activity sport volleyball detection recognition video analysis||link||2014-10-23||1189|
|248||VIDEO datasets overview||Many different labeled video datasets have been collected over the past few years, but it is hard to compare them at a glance. So we have created a handy spread...||video benchmark recognition classification detection object action||link||2014-09-30||966|
|245||ETHZ CVL Video SumMe||The Video Summarization (SumMe) dataset consists of 25 videos, each annotated with at least 15 human summaries (390 in total). The data consists of videos, anno...||video summary benchmark human groundtruth action event||link||2016-10-21||1055|
|235||Kindergarten Video Surveillance||The dataset consists of about 50 hours of kindergarten surveillance video, in total approximately 100 video sequences (1000 GB, 50 hours)...||human action behavior segmentation video background surveillance||link||2015-10-08||1005|
|219||JPL First-Person Interaction||JPL First-Person Interaction dataset (JPL-Interaction dataset) is composed of human activity videos taken from a first-person viewpoint. The dataset particularl...||video action recognition interactive motion human||link||2014-02-03||562|
|207||CASIA Gait Recognition Dataset||Dataset A (former NLPR Gait Database) was created on Dec. 10, 2001, including 20 persons. Each person has 12 image sequences, 4 sequences for each of the three ...||gait recognition biometry action classification motion human foot pressure||link||2017-03-10||1994|
|201||50 Salads||The dataset captures 25 people preparing 2 mixed salads each and contains over 4h of annotated accelerometer and RGB-D video data. Annotated activities correspo...||action activity recognition classification detection tracking video||link||2013-10-05||748|
|185||Kung-Fu fighter Multi-View||The test sequences provide interested researchers a real-world multi-view test data set captured in the blue-c portals. The data is meant to be used for testing...||multiview tracking segmentation camera action||link||2013-10-08||762|
|182||MSR Action||The MSR Action datasets are a collection of various 3D datasets for action recognition. See details at http://research.microsoft.com/en-us/um/people/zliu/action...||video action recognition detection reconstruction 3d||link||2013-09-05||772|
|173||MuHAVi and MAS human action||The Multicamera Human Action Video Data (MuHAVi) and Manually Annotated Silhouette Data (MAS) are two datasets consisting of selected action sequences for the eval...||human action behavior segmentation video background||link||2017-04-04||1241|
|171||CHALEARN Multi-modal Gesture Challenge||The CHALEARN Multi-modal Gesture Challenge is a dataset of more than 700 sequences for gesture recognition using images, Kinect depth, segmentation and skeleton data. ht...||gesture, kinect, recognition, human, action, illumination, depth, segmentation, skeleton||link||2013-08-09||684|
|170||Sheffield Kinect Gesture (SKIG) dataset||The Sheffield Kinect Gesture (SKIG) dataset contains 2160 hand gesture sequences (1080 RGB sequences and 1080 depth sequences) collected from 6 subjects. ...||gesture, kinect, recognition, human, action, illumination, depth||link||2013-08-09||772|
|153||MSRC Kinect Gesture Dataset||The Microsoft Research Cambridge-12 Kinect gesture dataset consists of sequences of human movements, represented as body-part locations, and the associated gest...||gesture, kinect, recognition, human, action||link||2013-08-08||818|
|141||Berkeley Multimodal Human Action Database (MHAD)||The Berkeley Multimodal Human Action Database (MHAD) contains 11 actions performed by 7 male and 5 female subjects in the range 23-30 years of age except for on...||action classification multiview motion recognition||link||2014-02-03||767|
|42||Hollywood Videos||The Hollywood-2 dataset contains 12 classes of human actions and 10 classes of scenes distributed over 3669 video clips and approximately 20.1 hours of video in t...||action, classification, video, segmentation||link||2013-03-12||893|
|41||KTH Action||The current video database contains six types of human actions (walking, jogging, running, boxing, hand waving and hand clapping) performed several times by 2...||action, classification, video, segmentation||link||2013-03-12||597|
|40||Weizmann Action||The Weizmann actions dataset by Blank, Gorelick, Shechtman, Irani, and Basri consists of ten different types of actions: bending, jumping jack, jumping, jump in...||video, segmentation, action, classification||link||2015-07-14||639|
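For scripted access, rows exported in the listing's pipe-delimited form can be parsed and filtered by tag. This is a minimal sketch under the assumption that each row follows the field order id, name, description, tags, link, updated, views; `parse_row` and `filter_by_tag` are hypothetical helpers, not part of the site.

```python
# Hypothetical helpers for the pipe-delimited listing format above.
# Assumed field order: id, name, description, tags, link, updated, views.

def parse_row(row: str) -> dict:
    """Split one '|id||name||...||views|' row into a record."""
    fields = row.strip().strip("|").split("||")
    return {
        "id": int(fields[0]),
        "name": fields[1],
        "description": fields[2],
        "tags": fields[3].lower(),
        "link": fields[4],
        "updated": fields[5],
        "views": int(fields[6]),
    }

def filter_by_tag(rows, tag):
    """Return dataset names whose tag field mentions the given keyword."""
    return [r["name"] for r in map(parse_row, rows) if tag.lower() in r["tags"]]

# Two rows abbreviated from the listing above:
rows = [
    "|41||KTH Action||Six types of human actions...||action, classification, video, segmentation||link||2013-03-12||597|",
    "|153||MSRC Kinect Gesture Dataset||Human movements as body-part locations...||gesture, kinect, recognition, human, action||link||2013-08-08||818|",
]
print(filter_by_tag(rows, "kinect"))  # → ['MSRC Kinect Gesture Dataset']
```

A substring match on the tag field is deliberately loose here, mirroring how a tag-cloud search behaves; an exact-tag match would instead split the tag field on commas first.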