Technical Program

Paper Detail

Paper: PS-1B.5
Session: Poster Session 1B
Location: H Fläche 1.OG
Session Time: Saturday, September 14, 16:30 - 19:30
Presentation Time: Saturday, September 14, 16:30 - 19:30
Presentation: Poster
Publication: 2019 Conference on Cognitive Computational Neuroscience, 13-16 September 2019, Berlin, Germany
Paper Title: DeepGaze III: Using Deep Learning to Probe Interactions Between Scene Content and Scanpath History in Fixation Selection
License: This work is licensed under a Creative Commons Attribution 3.0 Unported License.
Authors: Matthias Kümmerer, Thomas S.A. Wallis, University of Tübingen, Germany; Matthias Bethge, University of Tübingen and Bernstein Center for Computational Neuroscience Tübingen, Germany
Abstract: Many animals make eye movements to gather relevant visual information about the environment. How fixation locations are selected has been debated for decades in neuroscience and psychology. One hypothesis states that "priority" or "saliency" values are assigned locally to image locations, independent of saccade history, and are only later combined with saccade history and other constraints to select the next fixation location. A second hypothesis is that there are interactions between saccade history and image content that cannot be summarised by a single value. Here we discriminate between these possibilities in a data-driven manner. Using transfer learning from the VGG deep neural network, we train a scanpath prediction model, "DeepGaze III", on human free-viewing scanpath data. DeepGaze III can either be forced to use a single saliency map or can be allowed to learn complex interactions via multiple saliency maps. We find that using multiple saliency maps gives no advantage in scanpath prediction compared to a single saliency map. This suggests that -- at least for free-viewing -- no complex interactions between scene content and scanpath history exist, and a single saliency map may exist that does not depend on either current or previous gaze locations.
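The single-saliency-map hypothesis tested in the abstract can be illustrated with a toy sketch, assuming a gaze-independent priority map that is combined with scanpath history only at fixation-selection time. Everything here is hypothetical stand-in code (the random array stands in for deep VGG features, and the Gaussian history penalty is a generic inhibition-of-return term, not the authors' actual model):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a saliency map derived from deep image features.
# Crucially, it does NOT depend on where the observer has looked before.
H, W = 8, 8
saliency = rng.random((H, W))

def history_prior(fixations, shape, sigma=2.0):
    """Gaussian penalty around previous fixations (a generic
    inhibition-of-return term, assumed for illustration)."""
    ys, xs = np.mgrid[0:shape[0], 0:shape[1]]
    prior = np.zeros(shape)
    for fy, fx in fixations:
        prior += np.exp(-((ys - fy) ** 2 + (xs - fx) ** 2) / (2 * sigma ** 2))
    return prior

def next_fixation(saliency, fixations):
    """Single-map hypothesis: the fixed saliency map is combined with
    saccade history only AFTER the fact, here by simple subtraction."""
    priority = saliency - history_prior(fixations, saliency.shape)
    return tuple(np.unravel_index(np.argmax(priority), saliency.shape))

# Roll out a short scanpath from an initial central fixation.
scanpath = [(4, 4)]
for _ in range(3):
    scanpath.append(next_fixation(saliency, scanpath))
print(scanpath)
```

The alternative hypothesis would instead let the saliency map itself change as a function of the scanpath; the paper's finding is that, for free viewing, this extra flexibility yields no prediction advantage over the fixed-map scheme sketched above.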