The BEOID dataset includes object interactions ranging from preparing coffee to operating a weight-lifting machine and opening a door. The dataset is recorded at six locations: kitchen, workspace, laser printer, corridor with a locked door, cardiac gym and weight-lifting machine. For the first four locations, sequences from five different operators were recorded (two sequences per operator), and from three operators for the last two locations (three sequences per operator).
The dataset was recorded with a wearable gaze tracker (ASL Mobile Eye XG). Synchronised wide-lens video with calibrated 2D gaze fixations is available. Moreover, we release 3D information obtained from a pre-built point-cloud map and PTAM tracking; three-dimensional information for both the images and the gaze fixations is included.
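As an illustration, per-frame 2D gaze fixations could be loaded as follows. This is a minimal sketch under assumed conventions: the column names (`frame`, `gaze_x`, `gaze_y`) and the CSV layout are hypothetical, not the actual BEOID file format.

```python
import csv
import io

# Hypothetical sample: one row per video frame, with the calibrated
# gaze fixation given in pixel coordinates. The real BEOID files may
# use a different layout and field names.
sample = """frame,gaze_x,gaze_y
0,412.5,198.3
1,415.1,201.7
2,413.9,200.2
"""

def load_fixations(stream):
    """Parse gaze fixations into a dict mapping frame index -> (x, y)."""
    fixations = {}
    for row in csv.DictReader(stream):
        fixations[int(row["frame"])] = (float(row["gaze_x"]),
                                        float(row["gaze_y"]))
    return fixations

fixations = load_fixations(io.StringIO(sample))
print(fixations[1])  # (415.1, 201.7)
```

Indexing by frame number keeps the fixations easy to synchronise with the corresponding video frames.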
This dataset is released alongside our publication:
** Damen, Dima and Leelasawassuk, Teesid and Haines, Osian and Calway, Andrew and Mayol-Cuevas, Walterio (Sep 2014). You-Do, I-Learn: Discovering Task Relevant Objects and their Modes of Interaction from Multi-User Egocentric Video. British Machine Vision Conference (BMVC), Nottingham, UK **
We thus provide ground truth for task-relevant object discovery. This takes the form of narrations written by the users themselves, as well as manually annotated three-dimensional bounding boxes indicating the locations of fixed and movable task-relevant objects in the scene. Our method and results can be found at:
In this work, we showcase video help guides using inserts on a pre-recorded video. These inserts are extracted, selected and displayed fully automatically. An overview of the method can be found at: http://youtu.be/vUeRJmwm7DA
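To sketch how the 3D bounding-box ground truth might be used, the snippet below tests whether a 3D point (for example, a gaze fixation mapped into the point-cloud coordinate frame) falls inside an object's box. The axis-aligned min/max-corner representation and the `Box3D` class are assumptions for illustration, not the dataset's actual annotation format.

```python
from dataclasses import dataclass

@dataclass
class Box3D:
    """Hypothetical ground-truth box: axis-aligned min/max corners
    in the point-cloud coordinate frame."""
    label: str
    min_corner: tuple  # (x, y, z)
    max_corner: tuple  # (x, y, z)

def contains(box, point):
    """True if the 3D point lies inside the box on every axis."""
    return all(lo <= p <= hi
               for lo, p, hi in zip(box.min_corner, point, box.max_corner))

# Example: check whether a fixation lands on a (made-up) "tap" object.
tap = Box3D("tap", (0.1, 0.0, 0.3), (0.3, 0.2, 0.5))
print(contains(tap, (0.2, 0.1, 0.4)))  # True
print(contains(tap, (0.5, 0.1, 0.4)))  # False
```

A containment test like this is one simple way to relate gaze fixations to the annotated task-relevant objects when evaluating discovery results.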