DE FREITAS, Julian: Rhythmic Multistability

Rhythmic Multistability

By Julian De Freitas


Multistable percepts have been thoroughly studied in vision, for example in the form of the Necker cube or Rubin’s vase (Moreno-Bote, Shapiro, Rinzel, & Rubin, 2010; Necker, 1832). They have two defining features: 1) the stimulus is open to more than one interpretation, and 2) these interpretations are mutually incompatible (Schwartz et al., 2012). More recently, scientists have also become interested in multistability within and across other modalities. There are a number of reasons for branching out in this way; the most general is to understand how stimulus attributes from the different senses are bound by our perceptual systems to create a coherent picture of the world (Schwartz et al., 2012). For instance, much of this work has explored whether bistability is controlled by a central, supramodal mechanism that modulates each of the sensory modalities, or locally within each modality (Hupé, Joffo, & Pressnitzer, 2008; for a more in-depth discussion see Schwartz et al., 2012).


Empirical examples of auditory multistability include auditory streaming (e.g., Denham & Winkler, 2006; Kashino et al., 2007), the verbal transformation effect (Kondo & Kashino, 2007; Sato et al., 2006), and a single recognized instance of ‘bistability’ in rhythmic stimuli (Repp, 2007), although in this final instance the second percept was introduced via a rhythm self-imposed by the subject, as opposed to being present in the rhythmic stimulus itself. In auditory streaming, subjects are typically presented with a repeating sound triplet (a higher pitch flanked by two lower pitches), the percept of which switches between a single galloping stream and two separate streams. In the verbal transformation effect, a word (such as ‘life’) is presented repeatedly, and eventually the percept switches to an alternative organization of the sound segments (in this case, ‘fly’). The percept then continues to switch between these two interpretations. Subjects in these experiments are typically informed beforehand of the different possible interpretations, and of the fact that their percepts may change. If subjects are not given such a briefing, it may take much longer for them to experience a percept switch (Hupé & Pressnitzer, 2012). Perceptual switch rates are generally consistent for a given individual, although they may vary considerably among individuals (Kleinschmidt, Sterzer, & Rees, 2012). Furthermore, previous work has suggested (at least in auditory streaming) that percept switches may be induced not by some discrepancy in the stimulus, but by attentional switches largely under the control of the subject (Hupé & Pressnitzer, 2012).


Compared to the rich variety of multistable stimuli (both static and dynamic) known in vision, there is a scarcity of multistable stimuli in audition and other modalities (Schwartz et al., 2006). For example, the only known instance of tristable stimuli in audition occurs, arguably, for the verbal transformation effect (Warren, 1961; Warren & Gregory, 1958), despite the fact that the study of tristable stimuli in vision has provided a number of useful insights (Wallis & Ringelhan, 2013). There have been no documented instances of tristability in auditory streaming (Hupé & Pressnitzer, 2012). More generally, multistable stimuli provide useful paradigms to inform larger questions of multisensory binding, the modular organization of multisensory processing in the brain, and so forth. For instance, such questions have previously been approached by searching for commonalities in the number of percept switches for visual versus auditory bistable stimuli (Kondo et al., 2011), or by asking whether crossmodal interference occurs between visual and auditory bistable stimuli when these stimuli are presented simultaneously (Hupé et al., 2008). These larger questions continue to be a source of contention (e.g., see Kubovy & Van Valkenburg, 2001; Summerfield, 1987), and one reason for this may be that the visual and auditory multistable stimuli used in these experiments are not sufficiently similar to warrant a valid comparison. Previous research suggests that there are fundamental differences between some of these stimuli, such as between visual plaids and auditory streams (Hupé & Pressnitzer, 2012), and yet these stimuli have previously been compared under the assumption that they are analogous (e.g., Kondo et al., 2011). As an example, Hupé et al. (2008) conclude that, since they find no crossmodal interference between auditory streams and visual plaids when these stimuli are presented simultaneously, bistability is not controlled supramodally.


Although there have been few documented cases of auditory multistable percepts in the empirical literature, such percepts may be indirectly alluded to in non-experimental contexts, such as the music-theoretic literature. For instance, Cooper & Meyer (1960, p. 32) refer to “ambiguous rhythms”: “it is difficult to decide which grouping is the dominant or manifest one.” Furthermore, it has been noted in analyses of the hemiola musical figure that a given rhythm can be heard in more than one meter (e.g., Taylor, 2012). However, in such cases it is unclear whether shifts in the percept can occur without the help of a disambiguating context, whereas the most interesting aspect of standard multistable stimuli is that multiple percepts are produced by the very same, unmodified stimulus.


In this paper, I will introduce a new kind of auditory multistable stimulus that relies on rhythmic figures (Bamberger, 1982; Smith, Cuddy, & Upitis, 1994). The inspiration for these stimuli derives partly from music listening, during which there are rare occasions when one’s percept of the music flips. The reason such multistable stimuli may be only rarely encountered is that the contexts in which they occur serve to disambiguate between the different possible percepts, whereas the multiplicity of such stimuli can be better appreciated when they are presented in isolation (Klink, van Wezel, & van Ee, 2012). Furthermore, the experience of multistability is likely more common for certain musical styles, such as African polyrhythmic music (e.g., Locke, 2010; Temperley, 2000).


As I will aim to show in a series of experiments, the advantages of documenting a new stimulus of this kind are manifold: 1) it provides a new general tool for studying the kinds of questions highlighted above; 2) its stability can be manipulated so that it is tristable, quadristable, or potentially supports an even greater number of stable percepts; and 3) the strength of its multistability can be increased or decreased by changing the speed at which the rhythm is played, or by creating rhythmic figures that have either stronger or weaker ambiguity (by manipulating specific cues in the rhythm itself).


Experiments (to be run either online or in-person):

In designing the stimuli, I chose those that intuitively produced the desired number of percepts. In order to create different potential arrangements of sound sequences in the first place, rest pauses were introduced between different groupings of notes.
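
As a rough illustration of how such a figure could be specified, the sketch below turns a pattern of notes and rests into note onset times at a chosen speed. The pattern notation (`x` for a note, `.` for a rest), the function name `onset_times`, and the example figure are all assumptions made for illustration; they are not the stimuli actually chosen for the experiments.

```python
def onset_times(pattern, unit_s, cycles=1):
    """Return note onset times (in seconds) for a looped rhythmic figure.

    pattern: string in a hypothetical notation where 'x' is a sounded note
             and '.' is a rest; each symbol occupies one time unit.
    unit_s:  duration of one time unit in seconds (this sets the speed).
    cycles:  number of times the figure is repeated back to back.
    """
    if unit_s <= 0:
        raise ValueError("unit_s must be positive")
    onsets = []
    for cycle in range(cycles):
        for i, symbol in enumerate(pattern):
            if symbol == "x":
                # Position within the whole sequence, converted to seconds.
                onsets.append((cycle * len(pattern) + i) * unit_s)
    return onsets


# Illustrative figure only: a group of 2 notes and a group of 3 notes,
# each followed by a one-unit rest.
figure = "xx.xxx."
slow = onset_times(figure, unit_s=0.25, cycles=2)
fast = onset_times(figure, unit_s=0.125, cycles=2)  # same figure, twice as fast
```

Because the figure and the speed are separate arguments, the same rhythm can be rendered at several speeds without altering its grouping structure.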


Experiment 1: This experiment will aim to show that bistable rhythmic figures exist. Subjects will be informed before the experiment of the different possible percepts, and that they may experience shifts among them. Subjects will be played bistable rhythmic figures and will report during each trial which interpretation they are currently hearing. In addition, as a more objective measure of subjects’ percepts, subjects will perform the target response task used in De Freitas et al. (2013): even though the rhythmic stimulus will remain the same, the prediction is that the object-based attention effect will follow the object percept that subjects are currently experiencing. To ensure different percepts (at least at the beginning of the trial), different trials will begin the rhythm in a way that biases interpretation toward one percept or the other. Aside from providing an objective measure of percept switches, a positive result in this experiment would also suggest that awareness is necessary for temporal object-based attention (which would probably be the title of the paper).
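
One way the subjective reports from a trial might be summarized is sketched below: it counts percept switches and collects how long each percept lasted. The data layout (a time-sorted list of `(time_s, label)` reports), the labels `"A"` and `"B"`, and the function name `summarize_reports` are hypothetical; the proposal does not fix a response format.

```python
def summarize_reports(reports, trial_end_s):
    """Summarize one trial of percept reports.

    reports:     list of (time_s, percept_label) tuples, sorted by time
                 (an assumed format, not specified in the proposal).
    trial_end_s: time at which the trial ended, in seconds.

    Returns (switch_count, durations), where durations maps each percept
    label to the list of its dominance durations in seconds.
    """
    # Collapse consecutive reports of the same percept into one phase.
    phases = []
    for time_s, label in reports:
        if not phases or phases[-1][1] != label:
            phases.append((time_s, label))

    switch_count = max(len(phases) - 1, 0)

    durations = {}
    for idx, (start, label) in enumerate(phases):
        # A phase ends when the next one starts, or when the trial ends.
        end = phases[idx + 1][0] if idx + 1 < len(phases) else trial_end_s
        durations.setdefault(label, []).append(end - start)
    return switch_count, durations


# Made-up example data for a 16-second trial.
example = [(0.0, "A"), (4.0, "B"), (6.0, "B"), (10.0, "A")]
switches, dominance = summarize_reports(example, trial_end_s=16.0)
# switches == 2; dominance == {"A": [4.0, 6.0], "B": [6.0]}
```

Note that the final phase is cut off by the end of the trial, so its duration is a lower bound and an analysis may want to treat it separately.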


Experiment 2: (This would likely be its own paper, focused more on music-theoretic issues.) This experiment would explore more deeply the features of bistable rhythmic patterns and their relation to the underlying meter. One option would be to explore the effects of different rhythmic tempos on the number of percept switches.
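
If trials at different tempos also differ in length, switch counts would need to be normalized before comparison. The sketch below computes a mean switch rate (switches per minute) for each tempo; the trial tuple layout, the function name `mean_switch_rate_by_tempo`, and the numbers are invented for illustration and are not results.

```python
from collections import defaultdict
from statistics import mean


def mean_switch_rate_by_tempo(trials):
    """Average percept switches per minute for each tempo condition.

    trials: iterable of (tempo_bpm, switch_count, duration_s) tuples
            (a hypothetical layout chosen for this sketch).
    """
    rates = defaultdict(list)
    for tempo_bpm, switch_count, duration_s in trials:
        # Normalize by trial length so trials of unequal duration compare fairly.
        rates[tempo_bpm].append(switch_count / (duration_s / 60.0))
    return {tempo: mean(values) for tempo, values in sorted(rates.items())}


# Invented numbers, purely to show the call.
demo_trials = [(60, 4, 120.0), (60, 6, 120.0), (120, 9, 60.0)]
rates_by_tempo = mean_switch_rate_by_tempo(demo_trials)
# rates_by_tempo == {60: 2.5, 120: 9.0}
```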


Experiment ‘n’: After thoroughly investigating bistable rhythmic figures, I would investigate the reality of tristable rhythmic figures, including the question of how percepts of these rhythms change as the rhythm is played at different speeds.




Bamberger, J. (1982). Revisiting children’s drawings of simple rhythms: A function of reflection-in-action. In: S. Strauss (Ed.), U-shaped behavioral growth (pp. 191–226). Toronto: Academic Press.


Cooper, G. W., Meyer, L. B. (1960). The rhythmic structure of music. Chicago, IL: University of Chicago Press.


De Freitas, J., Liverence, B. M., Scholl, B. J. (in press). Attentional Rhythm: A temporal analogue of object-based attention. Journal of Experimental Psychology: General.


Denham, S. L. & Winkler, I. (2006). The role of predictive models in the formation of auditory streams. Journal of Physiology: Paris, 100, 154–170.


Hupé, J-M., Joffo, L-M., Pressnitzer, D. (2008). Bistability for audiovisual stimuli: Perceptual decision is modality specific. Journal of Vision, 7, 1–15.


Hupé, J-M., Pressnitzer, D. (2012). The initial phase of auditory and visual scene analysis. Philosophical Transactions of the Royal Society: Biological Sciences, 367, 942–953.


Kashino, M., Okada, M., Mizutani, S., Davis, P. & Kondo, H. M. (2007). The dynamics of auditory streaming: psychophysics, neuroimaging, and modeling. In Hearing – from sensory processing to perception (eds B. Kollmeier, G. Klump, V. Hohmann, U. Langemann, M. Mauermann, S. Uppenkamp & J. Verhey), pp. 275–283. Berlin, Germany: Springer.


Kleinschmidt, A., Buchel, C., Zeki, S., Frackowiak, R. S. J. (1998). Human brain activity during spontaneously reversing perception of ambiguous figures. Proceedings of the Royal Society of London: Biological Sciences, 265, 2427–2433.


Kleinschmidt, A., Sterzer, P., Rees, G. (2012). Variability of perceptual multistability: from brain state to individual trait. Philosophical Transactions of the Royal Society: Biological Sciences, 367, 988–1000.


Klink, P. C., van Wezel, J. A., van Ee, R. (2012). United we sense, divided we fail: context-driven perception of ambiguous visual stimuli. Philosophical Transactions of the Royal Society: Biological Sciences, 367, 932–941.


Kondo, H. M. & Kashino, M. (2007). Neural mechanisms of auditory awareness underlying verbal transformations. NeuroImage, 36, 123–130.


Kondo, H. M., Kitagawa, N., Kitamura, M. S., Koizumi, A., Nomura, M., Kashino, M. (2011). Separability and commonality of auditory and visual bistable perception. Cerebral Cortex. doi:10.1093/cercor/bhr266.


Kubovy, M., & Van Valkenburg, D. (2001). Auditory and visual objects. Cognition, 80, 97–126.


Moreno-Bote, R., Shapiro, A., Rinzel, J. & Rubin, N. (2010). Alternation rate in perceptual bistability is maximal at and symmetric around equi-dominance. Journal of Vision, 10, 1.


Necker, L. A. (1832). Observations on some remarkable optical phenomena seen in Switzerland; and on an optical phenomenon that occurs on viewing a figure of a crystal or geometrical solid. Philosophical Magazine and Journal of Science, 1, 329–337.


Poudrier, È., Repp, B. H. (2013). Can musicians track two different beats simultaneously? Music Perception, 4, 369–390.


Repp, B. H. (2007). Hearing a melody in different ways: Multistability of metrical interpretation, reflected in rate limits of sensorimotor synchronization. Cognition, 102, 434–454.


Sato, M., Schwartz, J. L., Abry, C., Cathiard, M. A., & Loevenbruck, H. (2006) Multistable syllables as enacted percepts: A source of an asymmetric bias in the verbal transformation effect. Perception & Psychophysics, 68, 458–474.


Schwartz, J-L., Grimault, N., Hupé, J-M., Moore, B. C. J., Pressnitzer, D. (2012). Philosophical Transactions of the Royal Society: Biological Sciences, 367, 896–905.


Smith, K. C., Cuddy L. L., Upitis, R. (1994). Figural and metric understanding of rhythm. Psychology of Music, 22, 117–135.


Summerfield, Q. (1987). Some preliminaries to a comprehensive account of audio-visual speech perception. In Hearing by eye: the psychology of lip-reading (eds B. Dodd & R. Campbell), pp. 3–51. London, UK: Lawrence Erlbaum Associates.


Taylor, S. A. (2012). Hemiola, maximal evenness, and metric ambiguity in late Ligeti. Contemporary Music Review, 31, 203–220.


Temperley, D. (2000). Meter and grouping in African music: A view from music theory. Ethnomusicology, 44, 65–96.


Wallis, G., Ringelhan, S. (2013). The dynamics of perceptual rivalry in bistable and tristable perception. Journal of Vision, 13, 1–24.


Warren, R. M. (1961). Illusory changes of distinct speech upon repetition—The verbal transformation effect. British Journal of Psychology, 52, 249–258.


Warren, R. M., & Gregory, R. L. (1958). An auditory analogue of the visual reversible figure. The American Journal of Psychology, 71, 612–613.