Snapshots of third- and first-person views. Gaze is highlighted with red circles.

VEDI Project

Abstract: The postdoctoral research fellowship Scene Understanding and Behavioural Analysis from Egocentric Visual Data is part of the Vision Exploitation for Data Interpretation (V.E.D.I.) project, funded by CUTGANA (University of Catania). In this fellowship we will work with egocentric visual data gathered in environments such as nature reserves, parks, and densely vegetated areas. We chose the botanical garden of the University of Catania as the case study for our experiments. The work has three aims: scene understanding, behavioural analysis, and gaze prediction. What makes these goals challenging? First, the chosen environment is almost entirely outdoors. This setting is harder to handle because lighting conditions vary, the environment changes with the seasons and the weather, and scene classification is very difficult since the scenes consist of repetitive visual features (e.g., leaves and small plants). For these reasons, we will use gaze fixations to refine the analysis of the gathered egocentric visual data.

Behavioural analysis will be conducted on egocentric visual data augmented with user gaze fixations acquired by a Pupil 3D Eye Tracker device. We are interested in relating gaze fixations to the behaviour of casual and expert users visiting the botanical garden. Questions that motivate this research topic include: "Is the arrangement of the plants in the garden well designed, or could it be improved?", "Are the visitor trails well placed, or should they be moved to give more visibility to specific plants?", and "In which places do users spend the most time looking around?".
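The question about where users spend the most time looking around can be illustrated with a minimal sketch that sums fixation durations per garden area, separately for casual and expert users. Everything here is an assumption made for illustration: the `Fixation` record, the area labels, and the idea that each fixation has already been assigned to an area are hypothetical and do not reflect the eye tracker's export format or the project's actual pipeline.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Fixation:
    # All fields are hypothetical; real recordings would need a mapping step
    # that assigns each fixation to a garden area.
    user_group: str    # "casual" or "expert"
    area: str          # label of the garden area being visited
    duration_s: float  # fixation duration in seconds


def dwell_time_by_area(fixations):
    """Sum fixation durations for every (user_group, area) pair."""
    totals = defaultdict(float)
    for f in fixations:
        totals[(f.user_group, f.area)] += f.duration_s
    return dict(totals)


def rank_areas(fixations, user_group):
    """Return (area, total_seconds) pairs for one group, longest first."""
    totals = dwell_time_by_area(fixations)
    areas = [(area, t) for (group, area), t in totals.items() if group == user_group]
    return sorted(areas, key=lambda item: item[1], reverse=True)


# Made-up sample data, not measurements from the garden.
sample = [
    Fixation("casual", "area_A", 1.5),
    Fixation("casual", "area_B", 0.5),
    Fixation("casual", "area_A", 2.0),
    Fixation("expert", "area_C", 3.0),
]

casual_ranking = rank_areas(sample, "casual")
```

Comparing the rankings obtained for casual and expert users is one simple way to see whether the two groups concentrate their attention on different parts of the garden.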

Finally, gaze prediction is the most difficult task: we plan to use gaze fixations to train a model that predicts where users will look when viewing our scenes and points of interest. As described in the literature, gaze prediction remains an open problem. Indeed, predicting where someone will focus their attention (a problem known as Focus of Attention prediction) is strictly tied to the action the user is performing at that moment, and the set of possible actions is potentially unlimited. We will therefore try to restrict the possible actions to a small set.
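To make the role of a small action set concrete, the sketch below shows a very simple baseline, not the model planned for the project: it learns the mean gaze position for each action label and predicts that position whenever the action is observed again. The action names, the use of image coordinates normalised to [0, 1], and the image-centre fallback are all assumptions chosen for illustration.

```python
from collections import defaultdict


def fit_action_prior(samples):
    """Compute the mean gaze position per action.

    samples: iterable of (action, x, y), with x and y assumed to be
    normalised image coordinates in [0, 1].
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for action, x, y in samples:
        acc = sums[action]
        acc[0] += x
        acc[1] += y
        acc[2] += 1
    return {action: (sx / n, sy / n) for action, (sx, sy, n) in sums.items()}


def predict_gaze(prior, action, fallback=(0.5, 0.5)):
    """Predict gaze for an action; unseen actions fall back to the image centre."""
    return prior.get(action, fallback)


# Made-up training samples with hypothetical action labels.
training = [
    ("walking", 0.5, 0.25),
    ("walking", 0.75, 0.75),
    ("observing_plant", 0.25, 0.5),
]

prior = fit_action_prior(training)
walking_gaze = predict_gaze(prior, "walking")
```

With a closed set of actions, every prediction reduces to a lookup in a small table; with an unbounded set, most actions would never appear in training and the baseline would fall back to the centre, which is one way to see why limiting the action set makes the problem more tractable.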

Related Publications:

  • Investigation on this topic is currently ongoing.