Neural properties of the 3D reference frame
transformation for reaching as predicted and explained by an artificial
network
Gunnar Blohm, Gerald P. Keith, J. Douglas Crawford
CESAME, Université catholique de Louvain, Louvain-la-Neuve, Belgium
CVR, York University, Toronto, Canada
Reaching requires a transformation of visual signals related to initial
hand and target positions into motor commands to move the arm. This
visuomotor transformation involves extraretinal eye and head position
signals and is computationally complex, in particular because of the 3D
rotational and translational geometry involved (Blohm & Crawford, J
Vis 2007). It is currently under debate where in the brain this
visuomotor transformation occurs and what signals we would expect to
find in areas involved in this process. Despite several general
theoretical descriptions, no one has ever trained a neural net to
perform the 3D transformation and analyzed its properties to develop a
theoretical framework for the actual physiology. Here we trained a
physiologically realistic artificial neural network to perform the
visuomotor transformation and demonstrate how 3D reference frame
transformations can be performed in distributed processing.
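
The rotational and translational geometry mentioned above can be illustrated with a minimal sketch. It is not the authors' model: it treats the eye, head and shoulder as a simple rigid-body chain and ignores retinal projection, disparity and vergence. The axis convention (x right, y forward, z up), the offsets and the function names are all hypothetical choices made for illustration.

```python
import numpy as np

def rotation_matrix(axis, angle_deg):
    """Rotation by angle_deg about axis, built with Rodrigues' formula."""
    axis = np.asarray(axis, dtype=float)
    axis = axis / np.linalg.norm(axis)
    angle = np.deg2rad(angle_deg)
    K = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def gaze_to_shoulder(p_gaze, R_eye_in_head, R_head_on_body,
                     eye_offset=(0.0, 0.1, 0.1),
                     shoulder_offset=(0.2, 0.0, -0.3)):
    """Map a gaze-centered 3D position to shoulder-centered coordinates.

    eye_offset and shoulder_offset are made-up translations (in metres),
    assumed here only to show where translations enter the chain.
    """
    p_gaze = np.asarray(p_gaze, dtype=float)
    # Eye-centered -> head-centered: rotate by eye-in-head, then translate.
    p_head = R_eye_in_head @ p_gaze + np.asarray(eye_offset)
    # Head-centered -> body-centered orientation.
    p_body = R_head_on_body @ p_head
    # Re-express relative to the shoulder.
    return p_body - np.asarray(shoulder_offset)

def movement_vector(target_gaze, hand_gaze, R_eye_in_head, R_head_on_body):
    """Shoulder-centered movement vector from hand to target."""
    target = gaze_to_shoulder(target_gaze, R_eye_in_head, R_head_on_body)
    hand = gaze_to_shoulder(hand_gaze, R_eye_in_head, R_head_on_body)
    return target - hand

# Usage: the same gaze-centered input yields a different movement vector
# once the eye is rotated 90 degrees about the vertical (z) axis.
target_gaze = np.array([0.1, 0.5, 0.0])
hand_gaze = np.array([0.0, 0.3, -0.2])
straight = movement_vector(target_gaze, hand_gaze, np.eye(3), np.eye(3))
rotated = movement_vector(target_gaze, hand_gaze,
                          rotation_matrix([0, 0, 1], 90), np.eye(3))
```

In this simplified chain the translations cancel when the hand position is subtracted from the target position, so only the rotations affect the movement vector. The full 3D geometry referred to in the text is richer than this sketch.
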
We used a fully connected 4-layer feed-forward artificial neural
network. The input layer had 2D retinal position maps of target and
initial hand position, 2D maps of retinal disparity of target and
initial hand position and extraretinal 3D eye and 3D head position
signals as well as a 1D vergence input. These inputs all projected to
the second (hidden) layer that in turn projected to a population output
(3rd) layer. We reconstructed the resulting movement vector in the
behavioral read-out (4th) layer, which assumed cosine tuning in the
population output and used an optimal linear estimator. The network was
trained (resilient back-propagation) on the theoretically computed
exact relationship between a random gaze-centered input and its
corresponding shoulder-centered movement vector output.
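
The read-out step described above, decoding a movement vector from a cosine-tuned population with a linear estimator, can be sketched as follows. This is an illustrative assumption rather than the network's actual read-out: the number of units, the random preferred directions, the noiseless activity and the ordinary least-squares fit (one common way to obtain such an estimator) are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_unit_vectors(n, rng):
    """Draw n random 3D preferred directions of unit length."""
    v = rng.normal(size=(n, 3))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

def cosine_population(movements, preferred):
    """Cosine tuning: each unit's activity is the projection of the
    movement vector onto that unit's preferred direction."""
    return movements @ preferred.T

def fit_linear_estimator(activity, movements):
    """Least-squares linear decoder mapping population activity
    back to 3D movement vectors."""
    weights, *_ = np.linalg.lstsq(activity, movements, rcond=None)
    return weights

# Hypothetical population of 50 output units.
preferred = random_unit_vectors(50, rng)

# Fit the decoder on random training movements.
train_moves = rng.uniform(-1.0, 1.0, size=(500, 3))
weights = fit_linear_estimator(cosine_population(train_moves, preferred),
                               train_moves)

# Decode a movement that was not in the training set.
test_move = np.array([[0.3, -0.2, 0.5]])
decoded = cosine_population(test_move, preferred) @ weights
```

Because the simulated activity is noiseless and exactly linear in the movement vector, the decoder recovers the test movement up to numerical precision. With noisy or nonlinear units the estimate would only be approximate.
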
We analyzed the network properties using an electrophysiological
approach. In order to investigate how the complete reference frame
transformation could be computed at the single unit level, we
characterized each unit's input and output reference frame for the
hidden (2nd) and population output (3rd) layers. The input reference
frame was assessed by observing how visual receptive fields were
modulated with extraretinal eye-head-vergence position. Hidden layer
units were purely gaze-centered and showed gain modulation whereas the
population output layer had mixed reference frames (between gaze- and
shoulder-centered). We used two techniques to find the output reference
frame for each unit, i.e. motor fields and simulated micro-stimulation,
combined with eye-head-vergence modulation. The motor fields showed
mixed reference frames for both the hidden and population output
layers. However, simulated micro-stimulation showed mixed reference
frames in the hidden layer and purely shoulder-centered coordinates in
the population output layer.
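
The logic of the input reference frame analysis above can be sketched in one dimension: map a unit's receptive field in retinal coordinates at several eye positions and measure how far the peak shifts. The Gaussian tuning, the linear gain term and the `frame_weight` parameter are hypothetical simplifications for illustration, not the units of the trained network.

```python
import numpy as np

def unit_response(retinal_pos, eye_pos, frame_weight, gain_slope=0.02):
    """Response of a toy unit with a Gaussian receptive field.

    frame_weight = 0 gives a gaze-centered field (fixed on the retina),
    frame_weight = 1 gives a body-centered field (fixed in space),
    and values in between give a mixed reference frame.
    Eye position also scales the response (gain modulation).
    """
    effective = retinal_pos + frame_weight * eye_pos
    tuning = np.exp(-effective**2 / (2.0 * 10.0**2))
    gain = 1.0 + gain_slope * eye_pos
    return gain * tuning

def shift_index(frame_weight):
    """Receptive field shift per degree of eye rotation.

    0 means the field does not move on the retina (gaze-centered);
    1 means it moves opposite to the eye by the full amount (body-centered).
    """
    retinal = np.linspace(-60.0, 60.0, 241)   # degrees, 0.5 deg steps
    eye_positions = np.array([-20.0, 0.0, 20.0])
    peaks = [retinal[np.argmax(unit_response(retinal, e, frame_weight))]
             for e in eye_positions]
    slope = np.polyfit(eye_positions, peaks, 1)[0]
    return -slope

gaze_centered = shift_index(0.0)
mixed = shift_index(0.5)
body_centered = shift_index(1.0)
```

In this toy unit the gain term changes the response amplitude with eye position but not the location of the peak, which is why a gain-modulated unit can still be classified as purely gaze-centered.
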
To summarize, the complete reference frame transformation from gaze-
to shoulder-centered coordinates was performed gradually at the single
unit level. Partial transformations in single units were then weighted
by extraretinal eye-head-vergence signals in a gain-like fashion and
combined at the population level. We also demonstrate how relative
distance is gradually transformed into absolute distance. Importantly,
different reference frames can be found in the same area if assessed by
different techniques. This work reconciles many apparently
contradictory findings that have reported different reference frames in
a particular area when it was investigated using different
electrophysiological techniques. This model also makes new predictions
as to what signals we should find in areas involved in the 3D reference
frame transformation for reaching and how such an area could be
identified based on its neural properties.
Supported by Marie Curie (EU) and CIHR (Canada)