Standard Mechanisms can Explain Grouping in Temporally Synchronous Displays
H. Farid and E.H. Adelson
Investigative Ophthalmology and Visual Science (ARVO), Fort Lauderdale, FL, 2000


Purpose. In a recent report, Lee and Blake (Science, 284, 1999) argued that the human visual system can use temporal microstructure to bind image regions into unified objects, as has been proposed in some neural models. Their stimuli were designed in an attempt to remove all classical form-giving cues, so that timing itself would provide the only form cue. They found that observers could see synchrony-defined form, and they posited the existence of special synchrony-sensitive mechanisms and binding processes. However, we believe that the filtering properties of early vision can convert the synchrony information into contrast information, from which standard mechanisms can extract form.

Methods. Lee and Blake's stimuli consisted of two dense regions of randomly oriented Gabor elements, where the Gabor phase randomly shifted forward or backward on each frame. The elements in a central rectangular region changed in synchrony according to a random sequence, while the elements in the background region changed independently. We downloaded several such movies from their web site, and simulated the effects of temporal lowpass and bandpass filtering.

Results. In the filtered movies, the target region's contrast fluctuated noticeably above and below that of the background. Consider the case of temporal lowpass filtering (i.e., simple visual persistence). If a Gabor element undergoes a run of several shifts in one direction, its effective contrast is low because temporal averaging combines frames of widely differing phase. Conversely, if it undergoes a run of alternating shifts, its effective contrast remains fairly high because it is "jittering" in place. Since the Gabor elements in the target region are synchronized, the effective contrast of the entire region fluctuates en masse, and from one moment to the next it can be noticeably different from that of the background. Similar results hold for bandpass temporal filters.
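The lowpass argument can be illustrated with a minimal numerical sketch. It is not the simulation reported here: the phase step (a quarter cycle), the box-shaped temporal filter, the four-frame window, and the element and frame counts are all assumptions chosen for illustration, and each element is reduced to a unit-contrast sinusoid. Averaging frames cos(x + phi_k) yields a sinusoid of amplitude |mean(exp(i phi_k))|, which is taken as the effective contrast.

```python
import numpy as np

STEP = np.pi / 2   # assumed phase shift per frame (not specified in the abstract)
WINDOW = 4         # assumed length of the box-shaped temporal filter, in frames


def effective_contrast(phases):
    """Amplitude of a unit-contrast sinusoid after averaging frames with these phases."""
    # The mean of cos(x + phi_k) over k is a sinusoid of amplitude |mean(exp(i*phi_k))|.
    return np.abs(np.exp(1j * np.asarray(phases)).mean(axis=-1))


def contrast_over_time(shifts):
    """Sliding-window effective contrast for +/-1 shift sequences of shape (..., n_frames)."""
    phases = np.cumsum(shifts * STEP, axis=-1)
    windows = np.lib.stride_tricks.sliding_window_view(phases, WINDOW, axis=-1)
    return effective_contrast(windows)


# Single element: a run of same-direction shifts versus alternating shifts.
run_contrast = effective_contrast(STEP * np.array([0, 1, 2, 3]))      # close to 0
jitter_contrast = effective_contrast(STEP * np.array([0, 1, 0, 1]))   # about 0.707

# Two regions: synchronized target versus independent background.
rng = np.random.default_rng(0)
n_elements, n_frames = 200, 500   # hypothetical sizes
target_shifts = np.tile(rng.choice([-1, 1], size=n_frames), (n_elements, 1))
background_shifts = rng.choice([-1, 1], size=(n_elements, n_frames))

# Region-wide mean contrast at each time step.
target_mean = contrast_over_time(target_shifts).mean(axis=0)
background_mean = contrast_over_time(background_shifts).mean(axis=0)

print("run:", run_contrast, "jitter:", jitter_contrast)
print("std of target mean contrast:", target_mean.std())
print("std of background mean contrast:", background_mean.std())
```

Under these assumptions the target region's mean contrast swings widely over time, because every element follows the same shift sequence, while the background's mean contrast stays nearly constant, because independent fluctuations average out across elements.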

Conclusions. Lee and Blake's stimuli were cleverly designed to remove form cues from single frames and frame pairs. However, when one considers the full sequence, strong contrast cues can emerge due to the spatio-temporal filtering present in early vision. These cues may well explain the perception of form in these displays, thus obviating the need to posit special grouping mechanisms based on temporal synchrony.

