Running head: A NEURAL ASSESSMENT OF AUDITORY INDUCTION
From Magicians to Broken Dishes: A Neural Assessment of Auditory Induction
Dana L. Schuler
University of California, Los Angeles
From Magicians to Broken Dishes: A Neural Assessment of Auditory Induction
Many people have seen a magician saw his assistant in half and perform a variety of other impossible feats. While we know that these acts are impossible, they still seem to be happening right in front of us. This leaves almost all of us wondering how the magician tricked our senses into perceiving the impossible; in other words, how did he create his illusion?
Interestingly, the human auditory system seems to share with magicians the ability to create illusions. Humans can see flashes of light where there are none (Shams, Kamitani and Shimojo, 2000, 2002) and hear one sound while another is actually present (Warren, 1970; Warren, Obusek and Ackroff, 1972; Warren and Warren, 1970). The McGurk Illusion is an example of an audio-visual illusion. This illusion uses a video of a person repeatedly mouthing the syllable “Ga”, “Ga”, “Ga”, while the accompanying audio repeats the sound “Ba”, “Ba”, “Ba”. Most subjects reported hearing the sound as “Da”, “Da”, “Da”. The visual information from the person's lips tricked the subjects’ auditory systems into hearing something that wasn't there (McGurk and MacDonald, 1976). This audio-visual illusion can work the other way, with auditory information influencing visual perception, as in the Fission effect. Subjects saw a single flash of light as two distinct flashes if the flash was accompanied by two auditory beeps (Shams, Kamitani and Shimojo, 2000, 2002).
Auditory illusions do not always require visual stimuli. For instance, subjects have reported “hearing” sounds that do not match the actual auditory stimulus (Warren, Obusek and Ackroff, 1972). This phenomenon is known as auditory induction, the perceptual synthesis of sound.
Warren (1984) has split auditory induction into two classes: temporal induction and contralateral induction. Contralateral induction occurs when sound presented to one ear determines the sound perceived by (but not presented to) the other ear. The sound heard by one ear alone serves to “fill in” the sound that the other ear would usually expect to hear. Temporal induction is the detection of sounds that are consistent with expected sounds but again are not present. A useful example of temporal induction is the ability to hear someone talking in a loud restaurant even though the speech is constantly interrupted by extraneous sounds. The brain “fills in” the parts of words that are obliterated by the louder, short sounds present in a restaurant. Therefore, we hear complete words and sentences.
Warren (1984) further divides temporal induction into homophonic induction, heterophonic induction, and contextual catenation. In homophonic and heterophonic induction a non-varying tone is interrupted, generally by another tone. In contextual catenation a varying sound (such as music or speech) is interrupted.
In homophonic induction two sounds differing only in loudness are alternately presented to the listener with no interruptions. Under these conditions, the listener will believe that the quieter tone is continuously present, even during the interval when only the louder tone is present (Warren, 1984). A simple graphical representation of this is shown in Figure 1.
Figure 1. The bold line indicates the actual auditory stimulus. The dotted line represents the subjects’ perception of the stimulus. They reported the quieter tone was continuous.
In 1972 Warren, Obusek and Ackroff demonstrated this concept with a three-tone experiment that used tones of 60, 70, and 80 decibels. Each tone was presented for 300 ms followed by the next tone as shown in Figure 2. There was no separation between the tones. The listener reported hearing the 60-decibel tone continuously throughout the session, even when informed that the 60-decibel tone was not present during the louder tone intervals.
Figure 2. Warren, Obusek and Ackroff’s 1972 three-tone experiment. Homophonic induction occurred: the subjects perceived the 60 dB tone as continuous (indicated by the dotted line) although the actual auditory stimulus was not continuous, as shown by the bold line.
It has been determined that only two tones are necessary to demonstrate the effect (Warren, 1984), and that introducing as little as 50 ms of silence between the tones prevents auditory induction from occurring (Warren, Obusek and Ackroff, 1972). Warren (1984) concluded that the listener “borrows” a portion of the louder sound to create the quieter sound.
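The three-tone stimulus described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used by Warren, Obusek and Ackroff: the tone frequency (1000 Hz) and sample rate (44,100 Hz) are assumptions that do not come from the text, and the names `relative_amplitude` and `three_tone_cycle` are hypothetical. The one fixed relationship is that a difference of 20 decibels corresponds to a tenfold difference in amplitude.

```python
import math

SAMPLE_RATE = 44_100       # samples per second (assumed, not from the text)
TONE_HZ = 1_000.0          # tone frequency (assumed, not from the text)
TONE_MS = 300              # duration of each tone, as described in the text
LEVELS_DB = (60, 70, 80)   # tone levels, as described in the text


def relative_amplitude(level_db, reference_db=80):
    """Linear amplitude of a tone relative to the loudest (reference) tone."""
    return 10 ** ((level_db - reference_db) / 20)


def three_tone_cycle():
    """Return one 900 ms cycle: 60, 70, and 80 dB tones with no gap between them."""
    samples_per_tone = SAMPLE_RATE * TONE_MS // 1000
    cycle = []
    for tone_index, level in enumerate(LEVELS_DB):
        amplitude = relative_amplitude(level)
        offset = tone_index * samples_per_tone  # keeps the phase continuous
        for n in range(samples_per_tone):
            t = (offset + n) / SAMPLE_RATE
            cycle.append(amplitude * math.sin(2 * math.pi * TONE_HZ * t))
    return cycle
```

Repeating the returned cycle back to back gives the alternating sequence; inserting zero-valued samples between tones would model the silent separation that was reported to prevent induction.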
Heterophonic induction is similar to homophonic induction except the interrupting tone will vary in dimensions other than just loudness (Warren, 1984). In 1950 Miller and Licklider reported that a tone alternated with broadband noise was perceived as being continuous, even though it was not present during the noise interval. The researchers interestingly explained the effect as one similar to looking through a picket fence while driving alongside the fence; the landscape on the other side is perceived as continuous despite the interruption of the fence.
In the previous types of temporal induction the interrupted sound was a continuous tone. In contextual catenation the sound is varying and could include music or speech (Warren, 1984). This makes it the more complex and interesting of the temporal inductions.
In an early example of contextual catenation subjects listened to the sentence “The state governors met with their respective legislatures convening in the capital city”. The letter “s” in “legislatures” was edited out and replaced by a cough or tone of the same length (see Figure 3). The subjects reported that all of the speech sounds were present, indicating that their auditory systems restored the obliterated sounds. When informed that a section was missing, none of the listeners were able to identify the correct position of the deleted section (Warren, 1970). While anyone familiar with English would have to hear the missing phoneme as an “s” (it is the only way to form an English word), ambiguous words also elicited distinct and correct word completions (Samuel, 1991). However, when the missing portion of the sentence was replaced with silence instead of a cough or tone, induction did not take place (Warren, 1970). See Figure 4.
Figure 3. The wavy line represents the portion of the word that was replaced by a noise (loud cough or tone). The missing “s” was heard as clearly as the other sounds actually vocalized.
Figure 4. When silence replaced the portion of the word between the two dashed lines, subjects did not report hearing the missing “s” sound.
While there are many behavioral studies with results showing auditory induction (Samuel, 1991; Warren, 1970; Warren, Obusek and Ackroff, 1972; Warren and Warren, 1970), the neural activity that occurs during this phenomenon is not well understood. Evidence suggests that macaque monkeys share the experience of illusory sound perception with humans (Miller, Dibble and Hauser, 2001; Petkov, O’Connor and Sutter, 2003). To obtain a better understanding of the neurophysiological basis of illusory sound perception, Petkov, O’Connor and Sutter (2007) used an electroencephalogram to record macaque monkeys’ neuronal activity under conditions that have been found to cause induction. The neurons in the monkeys’ auditory cortex (A1) showed the same activity when the researchers played a continuous tone and when they played a tone interrupted by a loud noise. However, when the tone was interrupted by a silent gap, the activity in the auditory cortex was different: the monkeys’ neurons signaled the end of the tone and its subsequent start when it resumed. Based on these results, Petkov et al. (2007) concluded that neuronal activity in the auditory cortex was consistent with an auditory induction model in which the brain “filled in” missing sounds.
The proposed experiment will use EEG to test whether human subjects show the same auditory cortex response during auditory induction. It is predicted that brain activity over the auditory cortex, recorded at areas T3 and T4 (Nardi, 2007), will be similar when subjects are presented with uninterrupted speech and with speech in which a segment has been replaced by a loud noise, and that brain activity will change when the speech is interrupted by a silent gap.
Participants
There will be a total of 12 subjects (six females, six males) and they will be students at the University of California, Los Angeles. The subjects’ participation in the experiment will be voluntary and they will not be compensated for their time. Subjects will be between the ages of 18 and 24 and will be required to sign a consent form prior to participation in the study. All participants must pass a cursory audiometric screening.
The experimental design will be one-way and within-subjects. The independent variable, speech, will be operationally defined as a 30-second recorded conversation between two individuals in a restaurant setting. This independent variable will have three levels. The first level will be uninterrupted conversation: the audio clip will contain an unedited, continuous conversation between the two individuals (see Figure 5).
Figure 5. Uninterrupted speech condition. The bold dashed line represents uninterrupted conversation for thirty seconds.
The second level will be segment replacement: the audio clip will be edited to replace a one-second fragment, occurring between the nineteenth and twentieth seconds of conversation, with the sound of dishes breaking. There will be no pause between the inserted sound and either the phoneme preceding it or the phoneme following it (see Figure 6).
Figure 6. Segment replacement condition. The bold dashed line indicates conversation, while the wavy line signifies the sound of dishes breaking.
The third level will be interrupted conversation: the audio clip will be edited to remove the same one-second portion of conversation that was replaced in the second condition, but the gap will not be filled with any sound, leaving a one-second period of silence (see Figure 7).
Figure 7. Interrupted conversation condition. The bold dashed line designates conversation, while the vertical dashed lines represent the beginning and end of the silent gap.
The dependent variable will be electrical activity in the subject’s auditory cortex. The dependent variable will be operationally defined as the pattern of activity in areas T3 and T4 of the subject’s brain, as recorded by electroencephalogram during the three 30-second audio clips. This study will specifically compare the subject’s electroencephalogram during the eighteenth through the twenty-first second of each audio clip.
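The comparison window described above can be illustrated with a short Python sketch. It rests on several assumptions that the proposal does not state: the recording is held as a dictionary of per-channel sample lists, the window is read as running from 18 s to 21 s after clip onset (one second on either side of the manipulated segment), and the root-mean-square difference is used purely as a placeholder measure, since no analysis statistic is specified. The function names are hypothetical.

```python
import math

WINDOW_START_S, WINDOW_END_S = 18, 21   # analysis window (interpretation of the text)
CHANNELS = ("T3", "T4")                 # electrode sites named in the text


def extract_window(recording, sample_rate):
    """Return the 18-21 s samples for T3 and T4.

    recording: dict mapping channel name -> list of samples for one 30 s clip
    sample_rate: EEG samples per second (depends on the equipment; assumed known)
    """
    start = WINDOW_START_S * sample_rate
    end = WINDOW_END_S * sample_rate
    return {channel: recording[channel][start:end] for channel in CHANNELS}


def rms_difference(trace_a, trace_b):
    """Root-mean-square difference between two equal-length traces.

    A placeholder similarity measure; the proposal does not name a statistic.
    """
    if len(trace_a) != len(trace_b) or not trace_a:
        raise ValueError("traces must be non-empty and of equal length")
    squared = sum((a - b) ** 2 for a, b in zip(trace_a, trace_b))
    return math.sqrt(squared / len(trace_a))
```

Under the stated prediction, the difference between the uninterrupted and segment-replacement windows would be expected to be smaller than the difference between the uninterrupted and silent-gap windows for the same subject.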
Due to the large amount of variability between subjects tested using EEG (Nardi, 2007), a within-subjects design seems most appropriate. Nardi noted that this variability is probably a result of different activation thresholds for the same brain region among individuals. This also explains why a particular result is not predicted for all subjects presented with the same stimuli.
Materials and Apparatus
Prior to beginning the experiment, each participant will be given an audiometric screening. A tape measure and two pieces of masking tape will be needed to mark 20 feet between experimenter and participant. Once the subject has passed the initial screening, he or she will be seated in a chair at a table. A consent form will be signed and the participant will be fitted with a Lycra EEG cap embedded with 19 tin electrodes. A cable will connect the cap to the EEG. A laptop will record the EEG output and a second laptop with external speakers will be used to play the audio clips.
The audio clips will all be based on the same recording. This recording will consist of two male diners conversing about ordering food at a restaurant. All background noise typically heard at a restaurant will be excluded from this recording, leaving the diners’ conversation as the sole auditory stimulus. The first recording will contain 30 seconds of conversation between the two men at about 60 decibels, which is the average level for normal conversation (Dangerous Decibels, 2008). The second recording will be identical to the first with the exception of the period ranging from the nineteenth second to the twentieth second. This portion of the audio clip will be deleted and a 100 decibel sound (Dangerous Decibels, 2008) of dishes crashing will replace it. The third recording will be the same as the second audio clip except that the sound of the dishes crashing will be absent, leaving a silent gap (0 decibels) in the recording.
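The construction of the three recordings can be sketched as follows. This is a minimal illustration rather than the editing procedure itself: it assumes the base recording and the dish-crash sound are already available as lists of samples at a common sample rate, it uses zero-valued samples to stand in for the silent gap, and the names `build_conditions` and `db_to_amplitude_ratio` are hypothetical. The helper shows that the 40-decibel difference between the 60 decibel speech and the 100 decibel crash corresponds to a hundredfold difference in amplitude.

```python
CLIP_SECONDS = 30                  # clip length, from the text
GAP_START_S, GAP_END_S = 19, 20    # edited segment, from the text


def db_to_amplitude_ratio(delta_db):
    """Amplitude ratio corresponding to a level difference in decibels."""
    return 10 ** (delta_db / 20)


def build_conditions(speech, sample_rate, crash):
    """Return the three clips keyed by condition name.

    speech: samples of the 30 s base conversation
    crash:  samples of the one-second dish-crash sound (already scaled)
    """
    speech = list(speech)
    crash = list(crash)
    start = GAP_START_S * sample_rate
    end = GAP_END_S * sample_rate
    if len(speech) != CLIP_SECONDS * sample_rate:
        raise ValueError("base recording must be exactly 30 seconds long")
    if len(crash) != end - start:
        raise ValueError("crash sound must be exactly one second long")
    silence = [0.0] * (end - start)
    return {
        "uninterrupted": speech,
        "segment_replacement": speech[:start] + crash + speech[end:],
        "silent_gap": speech[:start] + silence + speech[end:],
    }
```

Because all three clips are cut from the same base recording and have the same length, the only difference between conditions lies in the samples between the nineteenth and twentieth seconds.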
Procedure
Two pieces of masking tape will be placed on the floor, 20 feet apart. The experimenter will stand at one piece of tape and the subject will face away from the experimenter, standing at the other piece of tape. The experimenter will whisper the subject’s name and the subject will raise his/her hand if his/her name is heard. If the subject does not pass the whisper test, then he or she will not be eligible for the experiment.
The subject will be asked to sit down at a table and sign a consent form. The experimenter will secure the subject’s hair at the nape of the neck, if applicable, and the EEG electrode cap will be placed on the subject’s head. The chest harness will be positioned around the torso and the EEG electrode cap’s chin strap will be secured to it. The experimenter will inject EEG recording gel into each electrode cap access hole with the exception of those on the crest of the head (corresponding to areas Fz, Cz, and Pz). Each of the remaining 16 electrodes will be tested to ensure proper functionality. The participant will then be asked to relax by closing his/her eyes, slowing his/her breathing, and clearing his/her mind of distractions.
The subject will be instructed to listen attentively to the audio clips that follow. The experimenter will play one of six groups of audio clips. Each group contains three audio clips, one for each condition, and the order of the audio clips will be unique to each group. This will ensure that the sequence of the audio clips does not confound the results. The group assigned to each subject will be randomized, so that each subject will have an equal chance of being assigned to each group. There will be a 30-second delay between consecutive audio clips. After the subject listens to his/her group of audio clips, the EEG cap and chest harness will be removed and the subject will be thanked for his/her participation.
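The six groups correspond to the six possible orderings of three conditions (3! = 6). The assignment step can be sketched in Python as follows. The condition labels, the function name `assign_orders`, and the optional seed are illustrative assumptions; the sketch draws a group independently for each subject, which matches the equal-chance description above but does not guarantee that each group is used equally often.

```python
import itertools
import random

CONDITIONS = ("uninterrupted", "segment_replacement", "silent_gap")

# Every possible presentation order of the three clips: 3! = 6 groups.
ORDERS = list(itertools.permutations(CONDITIONS))


def assign_orders(subject_ids, seed=None):
    """Give each subject one of the six clip orders, each equally likely.

    seed: optional value that makes the assignment reproducible.
    """
    rng = random.Random(seed)
    return {subject: rng.choice(ORDERS) for subject in subject_ids}
```

For example, `assign_orders(range(1, 13))` returns a dictionary mapping each of the 12 subject numbers to a tuple of three condition labels in playback order.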
Dangerous Decibels. (n.d.). Dangerous Decibels: Frequently Asked Questions: What sounds cause noise-induced hearing loss (NIHL). Retrieved January 4, 2008, from http://www.dangerousdecibels.org/faq.cfm#3
McGurk, H., & MacDonald, J. (1976). Hearing lips and seeing voices [Electronic version]. Nature, 264, 746–748.
Samuel, A. G. (1991). A further examination of attentional effects in the phonemic restoration illusion [Electronic version]. The Quarterly Journal of Experimental Psychology A: Human Experimental Psychology, 43A(3), 679–699.
Shams, L., Kamitani, Y., & Shimojo, S. (2000). What you see is what you hear [Electronic version]. Nature, 408(6814), 788.
Shams, L., Kamitani, Y., & Shimojo, S. (2002). A visual illusion induced by sound [Electronic version]. Cognitive Brain Research, 14(1), 147–152.