Paper Detail

Paper: PS-1B.23
Session: Poster Session 1B
Location: H Fläche 1.OG
Session Time: Saturday, September 14, 16:30 - 19:30
Presentation Time: Saturday, September 14, 16:30 - 19:30
Presentation: Poster
Publication: 2019 Conference on Cognitive Computational Neuroscience, 13-16 September 2019, Berlin, Germany
Paper Title: Analyzing disentanglement of visual objects in semi-supervised neural networks
License: This work is licensed under a Creative Commons Attribution 3.0 Unported License.
Authors: Andrew David Zaharia, Benjamin Peters, John Cunningham, Nikolaus Kriegeskorte, Columbia University, United States
Abstract: A fundamental goal of visual systems is to condense images into compact representations of the relevant information they contain. Ideally, these representations would consist of the independent “generative factors” that fully determine, on a semantic level, the visual input. Such a “disentangled” representation could consist of the identity of a background scene, together with the identity, position, pose, and size of an object. Recent research in deep neural networks (DNNs) has focused on achieving disentangled representations, through unsupervised learning, of single objects or faces in isolation. We trained and analyzed a popular DNN model of disentanglement, the β-variational autoencoder (β-VAE), on a new dataset containing a “foreground” white circle and a “background” isotropic Gaussian. We show that the autoencoder architecture we use can achieve a perfectly disentangled latent representation with supervised learning, but achieves only partial disentanglement when trained with the unsupervised β-VAE loss function. On our dataset, higher β values result in higher reconstruction loss and greater entanglement. We propose that further inductive bias is needed to achieve better disentanglement, such as a representation that factorizes static properties and their dynamics.
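For readers unfamiliar with the objective the abstract refers to, a minimal sketch of the standard β-VAE loss (reconstruction error plus a β-weighted KL term, in its usual closed form for a diagonal-Gaussian posterior and unit-Gaussian prior) is shown below. This is an illustrative reconstruction of the published formulation, not the authors' code; function and variable names are hypothetical.

```python
import numpy as np

def beta_vae_loss(x, x_recon, mu, log_var, beta=4.0):
    """Sketch of the beta-VAE objective: per-sample reconstruction error
    plus beta times KL( N(mu, diag(exp(log_var))) || N(0, I) ).
    Larger beta pushes the latent posterior toward the factorized prior,
    which encourages (but, as the abstract notes, does not guarantee)
    disentanglement, at the cost of reconstruction accuracy."""
    # Squared reconstruction error, summed over pixel dimensions.
    recon = np.sum((x - x_recon) ** 2, axis=-1)
    # Closed-form KL divergence to N(0, I), summed over latent dimensions.
    kl = 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=-1)
    # Average the per-sample losses over the batch.
    return np.mean(recon + beta * kl)
```

With β = 1 this reduces to the ordinary VAE evidence lower bound (negated); the abstract's observation is that on their circle-on-Gaussian dataset, raising β increases both terms' cost in practice rather than yielding cleaner factorization.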