- Standard Mechanisms can Explain Grouping in Temporally Synchronous
Displays
- H. Farid and E.H. Adelson
- Investigative Ophthalmology and Visual Science (ARVO), Fort
Lauderdale, FL, 2000
Purpose. In a recent report, Lee and Blake (Science, 284, 1999)
argued that the human visual system can use temporal microstructure to
bind image regions into unified objects, as has been proposed in some
neural models. Their stimuli were designed in an attempt to remove
all classical form-giving cues, so that timing itself would provide
the only form cue. They found that observers could see
synchrony-defined form, and they posited the existence of special
synchrony-sensitive mechanisms and binding processes. However, we
believe that the filtering properties of early vision can convert the
synchrony information into contrast information, from which standard
mechanisms can extract form.
Methods. Lee and Blake's stimuli consisted of two dense
regions of randomly oriented Gabor elements, where the Gabor phase
randomly shifted forward or backward on each frame. The elements in a
central rectangular region changed in synchrony according to a random
sequence, while the elements in the background region changed
independently. We downloaded several such movies from their web site,
and simulated the effects of temporal lowpass and bandpass filtering.
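The timing structure described above can be sketched as follows. This is only an illustration, not Lee and Blake's stimulus code: it assumes that "synchrony" means every target element follows one shared random sequence of forward/backward shifts, and the function name, element counts, and seed are hypothetical.

```python
import random

def make_shift_sequences(n_target, n_background, n_frames, seed=0):
    """Return per-element sequences of phase-shift directions (+1 or -1).

    Assumption: target elements all share one random sequence, while each
    background element draws its own independent sequence.
    """
    rng = random.Random(seed)
    shared = [rng.choice((-1, 1)) for _ in range(n_frames)]
    target = [list(shared) for _ in range(n_target)]
    background = [[rng.choice((-1, 1)) for _ in range(n_frames)]
                  for _ in range(n_background)]
    return target, background

target, background = make_shift_sequences(n_target=5, n_background=5, n_frames=100)
```

No single frame or frame pair distinguishes the two regions in this sketch; only the agreement of the sequences over time does.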
Results. In the filtered movies, the target region's contrast
fluctuated noticeably above and below that of the background.
Consider the case of temporal lowpass filtering (i.e., simple visual
persistence). If a Gabor element undergoes a run of multiple shifts
in one direction, its effective contrast is low due to the temporal
averaging. Conversely, if it undergoes a run of alternating shifts,
its effective contrast remains fairly high because it is ``jittering''
in place. Since the Gabor elements in the target region are
synchronized, the effective contrast of the entire region fluctuates
en masse, and from one moment to the next can differ noticeably from
that of the background. Similar results hold for bandpass temporal
filters.
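The averaging argument can be checked numerically with a minimal sketch. It makes several assumptions that are not taken from the abstract: the lowpass filter is a crude box-car average over 4 frames, each phase shift is a quarter cycle, only the sinusoidal carrier of a Gabor is modeled (the envelope is ignored), and target elements share one shift sequence. Averaging a unit-contrast sinusoid over a set of phases leaves an amplitude equal to the magnitude of the mean phasor, which is what `effective_contrast` computes; all names are hypothetical.

```python
import cmath
import math
import random
import statistics

STEP = math.pi / 2   # assumed phase shift per frame (hypothetical value)
WINDOW = 4           # assumed box-car averaging window, in frames

def phases_from_shifts(shifts, step=STEP):
    """Cumulative carrier phase for a sequence of +1/-1 shift directions."""
    phases = [0.0]
    for s in shifts:
        phases.append(phases[-1] + s * step)
    return phases

def effective_contrast(phases):
    """Amplitude of a unit-contrast sinusoid averaged over the given phases."""
    return abs(sum(cmath.exp(1j * p) for p in phases) / len(phases))

def contrast_over_time(shifts, window=WINDOW):
    """Sliding-window effective contrast of one element."""
    phases = phases_from_shifts(shifts)
    return [effective_contrast(phases[t:t + window])
            for t in range(len(phases) - window + 1)]

# Single element: a run of same-direction shifts versus alternating shifts.
run_contrast = effective_contrast(phases_from_shifts([+1, +1, +1]))
jitter_contrast = effective_contrast(phases_from_shifts([+1, -1, +1]))

# Region level: target elements share one sequence, background elements do not.
rng = random.Random(0)
n_frames, n_elements = 200, 50
target_shifts = [rng.choice((-1, 1)) for _ in range(n_frames)]
background_shifts = [[rng.choice((-1, 1)) for _ in range(n_frames)]
                     for _ in range(n_elements)]

# Every target element has the same contrast trace, so the region mean equals it.
target_mean = contrast_over_time(target_shifts)
background_traces = [contrast_over_time(s) for s in background_shifts]
background_mean = [sum(frame) / len(frame) for frame in zip(*background_traces)]

# Temporal variability of each region's mean effective contrast.
target_sd = statistics.pstdev(target_mean)
background_sd = statistics.pstdev(background_mean)
print(f"run: {run_contrast:.3f}, jitter: {jitter_contrast:.3f}")
print(f"SD of mean contrast, target: {target_sd:.3f}, background: {background_sd:.3f}")
```

Under these assumptions a run of same-direction shifts cancels almost completely while alternating shifts retain most of the contrast, and the synchronized region's mean contrast varies over time far more than the averaged background does. The exact values depend on the assumed step size and window length.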
Conclusions. Lee and Blake's stimuli were cleverly designed to
remove form cues from single frames and frame pairs. However, when one
considers the full sequence, strong contrast cues can emerge due to
the spatio-temporal filtering present in early vision. These cues may
well explain the perception of form in these displays, thus obviating
the need to posit special grouping mechanisms based on temporal
synchrony.