Online Experiments Demos

Survey at least one of the online experiments listed below. Write a short evaluation report about the goals and methods of the experiment (1-2 paragraphs). To the best of your knowledge, how well did the design fit the experimental goal? (These are not always provided; some present the goal(s) before, and some after.) Based on your exploration, what may be some of the limitations of the methods used? NOTE: Your project group will benefit most from assigning a different demo to explore to each group member. Please post your response here by April 11.

The Music Cognition Group at the University of Amsterdam has a few online experiments that you can test (click “Online Experiments” on left side menu):

Two of the online experiments are available in English:

– Rhythmic complexity:

– Timing and tempo in classical, jazz and rock music:

Both of these will require you to fill out a questionnaire before you start; you can fill those in randomly. Once the questionnaire is submitted, you should be able to proceed to the experiment proper.

9 thoughts on “Online Experiments Demos”

  1. Alex Roth: “I participated in the Universiteit van Amsterdam’s Music Cognition Group’s rhythmic complexity perception study. From the “Welcome to the listening experiment!” page I was able to gather that the general goal in conducting this study was to more fully understand perception of rhythmic complexity, but the experimental interface did not provide information on any more specific goals. Nor was the reasoning behind the experimental methods explained at any point. I was presented with 7 clusters of rhythms, each containing 2-4 rhythm patterns, and asked to rank rhythmic complexity (defined as “feeling of tension or violation of expectation or deviation from regular pattern or nonpredictability of events”) within groups, but I ultimately don’t know how these data were to be compared or analyzed. At the end of the experiment, I was directed to a page that stated “we are sorry we can’t show online results here,” and I was displeased that I had no access to the information that would allow me to more fully understand the experiment in which I had just participated.

    Perhaps when this experiment was live and data was still being collected, participants had access to this information about the experimental goals and methods. However, if this was the case, it would have been wiser for the Music Cognition Group to leave this information accessible so that future researchers (e.g., myself) could learn from this research and apply its lessons in their own research moving forward.”

  2. Gideon Broshy: “I tried the test on rhythmic “complexity” from the Music Cognition Group in Amsterdam. The test indicated that their aim was to record subjective measures of musical “complexity” (which they define as unpredictability, or violation of expectation); the paper that resulted (available on the Group’s website) reveals that the ultimate aim of the test was to see whether non-musicians and musicians alike judge salient events in metrically strong positions to be more or less unexpected than salient events in metrically weak positions.

    The test asks subjects to listen to clusters of 3 or 4 rhythms and to rate them from most to least “complex.” The task is easy and intuitive, and most likely produced an easily manageable set of data. It also tackles the question at hand in a fairly direct way. I found two problems with it, though. First, their definition of complexity (as violation of expectation, or unpredictability) is a bit counterintuitive, and was explained only briefly. Perhaps they would do better to dispense with the word “complexity,” as it may have (as it did for me) suggested something rather different, in the minds of the subjects, than what was intended. Secondly, I found myself judging the rhythms that came later in each group as less complex/unpredictable than I would have had I not heard other rhythms in the group. That is, hearing the first rhythm within a group made the other rhythms in the group sound less “complex.” This may have interfered with the task/judgment that the researchers were really testing for.”

  3. Honing, H., & Ladinig, O. (2009). Exposure influences expressive timing judgments in music. Journal of Experimental Psychology: Human Perception and Performance, 35(1), 281-288.

    Catherine Jameson: “The goal of this experiment was to figure out “whether, and if so to what extent, listeners’ previous exposure to music in everyday life, and expertise as a result of formal musical training, play a role in making expressive timing judgments in music” (from abstract). The stimuli consisted of pairs of songs from three different genres (jazz, rock, and classical), one of which had been slowed down or sped up from its original tempo to match the tempo of the other piece. The listener’s task was then to decide which was the “real” song, and also to indicate how sure they were of their answer and whether or not they were familiar with the song. Each participant did this for 15 pairs, 5 in each category. The authors found that familiarity with the genre played a role in the ability to discriminate, but musical experience had no effect, which suggests that exposure to musical genres allows listeners to become experts in the microtimings of each style.

    This study was well designed in that it used ecologically valid stimuli, which allows the results to be more generalizable to real circumstances of music listening. The stimulus format, a song sped up or slowed down to match the tempo of another song, is clever and implicitly gets at the root of the question of microtiming. However, this study doesn’t necessarily get at expressive timing, as that might vary from artist to artist and from day to day or performance to performance. It better represents a study of genre-specific patterns of microtiming.

    A limitation of this method would be the lack of control over the microtiming of each piece. In the case of this experiment, the data gathered represent participants’ overall perception of each piece as opposed to their specific reaction to precise microtimings. Even though this makes the data messier, it is not necessarily bad, as the results are closer to real-time, real-life reactions (more ecologically valid).”

  4. I participated in the study of factors influencing rhythmic complexity by the Music Cognition Group at the University of Amsterdam. This study seems to be attempting to evaluate the properties that make a rhythm more or less “complex”; that is, it seems to be trying to develop a sort of operational definition of rhythmic complexity by identifying the factors of a rhythm that can make it complex. The survey asks subjects to compare the complexity of varying numbers of sound clips; each clip repeats a particular rhythm 4 times, alternating between a high and a low MIDI drum sound. Subjects are asked to compare the rhythms in a particular group, ranking them in order of complexity (or rating them “all equal”). The study also included a questionnaire about musical training (and specifically percussion instrument training).

    Because the study seeks to investigate the properties that make a rhythm complex, I thought it undermined the purpose of the study a bit to define rhythmic complexity in the provided survey instructions. The researchers clearly suggest that rhythmic complexity is defined by a violation of expectation or unpredictability in a rhythm. By including this “definition,” the study actually switches its focus from rhythmic complexity to rhythmic predictability, which may very well be two different properties of rhythm. In this way, it seems that the study may have confounded itself.

  5. Harrison Davis

    I participated in the “Timing and tempo in classical, jazz and rock music” experiment. When I clicked the link, my browser was immediately checked for QuickTime support, and I was then directed to a fairly long questionnaire that sought to establish the depth and extent of my musical experience before I performed the main task. For the most part I felt that the questionnaire fulfilled this function quite well. However, I felt that one of the questions, which asked me to check off the musical genres I am familiar with, was incomplete. Most of the genres I am most familiar with were missing. Thus the validity of judging my musical experience based on those responses is questionable.

    Then I performed the task laid out for me. This application was quite well designed. Compositions from the classical, jazz, and rock genres were presented in pairs, and I had to judge which composition in each pair, according to instructions given just prior, had not been manipulated in tempo. The means of response were simple and worked quite well. In terms of design, my only problem was that I felt the question “do you like this piece?” was somewhat arbitrary.

    Speaking of irrelevance, the major problem I had with the overall process was that I found the question somewhat vague. Essentially the question stated, “Can you hear whether an audio fragment is a real performance or a manipulated, tempo-transformed version of it?” If this is the case, I don’t know why such a long questionnaire about my musical background was used. If these questions were the main focus, perhaps a better question would be “Can you hear whether an audio fragment (is real or fake – same as original) better when the piece is in a style you are more familiar with than when the piece is in a style you are not as familiar with?”

    • I also participated in this experiment. Aside from the instructions Harrison has already described, the experimenters also mention that “all fragments are processed in some way, so please ignore sound quality as a possible cue for deciding which is which.” In practice, I found this almost impossible to do (it’s a little like being asked not to think of a pink elephant), especially since the task was so hard. Besides, even if I had explicitly made an effort not to listen for such cues, in some cases they were so obvious that there was no way I was not going to be influenced by them, for example, when I heard someone sneezing during one of the recordings. Mmm.

      The varied nature of these pieces certainly makes them ecologically valid, although it doesn’t seem (at least on the surface) that the authors tried to control for many basic factors, such as loudness, sound quality, timbre, etc.; indeed, sometimes the songs being compared even seemed quite different. Perhaps this was the idea, but I would be very careful in how I interpreted any results from this study, given the possibility that responses could have been influenced by a number of different cues.

      One design feature of the study that I thought could have been improved (again, without knowing what it was they were testing) was the sound playing and response recording interface. At present, it is possible for a subject in this experiment to stop the recordings whenever they want, and even to respond without having heard the recordings. If subjects do not follow instructions (e.g., if they do not actually listen to the pieces), then this may create a lot of noise in the data. The experimenters should have constrained things so that you had to listen to the full recording before being able to make a response.

      Without knowing exactly what the study was trying to test, it is hard for me to say anything more substantive. I would have known, had the experimenters provided this debriefing information on the website itself, instead of requiring me to provide an email address. Of course, for fear of spam, I did not provide my real email address. Normally, the IRB will not allow experimenters to get away with this, so I do not know how they did.

  6. I completed the rhythmic complexity experiment from the Music Cognition Group in Amsterdam. I think that the design fit the experimental goal of better understanding individuals’ perception of rhythmic complexity. They grouped two to four rhythms together and asked me to rank the rhythmic patterns in terms of complexity, which they defined before the experiment to be “a feeling of rhythmical tension, the violation of your expectation, a deviation of a regular rhythmic pattern, or non-predictability of events.” I do think that over the course of the experiment, some of the rhythms became somewhat repetitive across groups, which made it difficult to remember which rhythms were in which group. Since they asked me to rank each rhythmic pattern relative to the other patterns in the group, this may have distorted my judgments of which patterns I perceived to be the most complex. I would suggest that rather than asking for relative rankings, they ask for a single evaluation of “how complex” each rhythm is on a scale, say from 1 to 10. In this way, they would better avoid the distortion that individuals’ memory for rhythm introduces into perceived complexity. Overall, I think the study has a useful purpose in that the research looks to better define the specific rhythmic characteristics that determine the complexity of a rhythmic pattern.

  7. Jenner Fox: “I did the experiment where participants were asked to discriminate between two music clips (classical, jazz, or rock) on the basis of whether they had undergone a tempo manipulation. We were presented with two music clips from the same genre and asked which one was not altered, whether we were sure of our choice, whether we were familiar with the piece, and whether we liked the piece. The goal of the experiment was to see if participants could tell whether tempos had been manipulated, and whether familiarity with the pieces affected the results. Another goal of the experiment was to investigate in which genre of music it was easiest for people to tell whether the tempo had been altered.

    There are several limitations of the method used. First, the experimenters could not control for which pieces people did or did not know, so the results across participants could have been skewed. Second, I think the experiment would be more valid and interesting with more genres of music. I especially think they should have had a pop music section, because more people would have been familiar with that music. Finally, the experiment could be more specific if they delved deeper into participants’ familiarity with certain pieces. For instance, if people were familiar with a piece, they could be asked whether they had just heard it once or whether they had practiced and performed it. Overall, I thought the experimental design was good, even though the task was very difficult. They did a good job of designing within the constraints of the online platform.”

  8. I completed the “Timing and tempo in classical, jazz and rock music” online demo.

    I liked their construction of the experiment, where the first section was a basic, open-ended survey, and the second section consisted entirely of controlled binary answers. The binary Yes-No setup would be quick to analyze and would hopefully lead to definite answers. I have two main critiques. First, I would like to know their reasoning behind asking whether subjects were “sure” of their answer. Perhaps it was to get the subject to really think about their response. Second, they should clarify what they mean by “are you familiar with this composition?”. Since it’s a Yes-No answer, if you vaguely recognize the song but cannot say the name or composer, does that count?
