One of the most important elements of voice identification is the ability to recognize the characteristics of the human voice. Many features help distinguish these characteristics, some auditory and some visual.
Think about when you have your back to a person who enters the room and says hello. If it is your child, spouse, or co-worker, I bet you recognize them immediately because you are familiar with their voice. This is the starting point for voice identification: becoming familiar with the characteristics of the unknown voice.
I began editing spoken word on reel-to-reel tape with razor blades and splicing tape. I had to learn to visualize the words in my mind’s eye in order to cut the tape in the right place. Today, we have software programs that display the waveform and sound spectrum of the spoken words, which makes the editing process much easier. The editor can see how the words look on the computer screen while deciding where to make the edit and connect the sentences, removing the stutters, coughs, gaps, and mistakes.
During the editing process, you will learn to listen for voice characteristics almost subconsciously. These characteristics include the way the words are spoken, the pronunciation of words, vowels, and consonants, the way the words flow together, and significant patterns of speech you may detect, such as accent, dialect, impediments, nasal cavity resonance, voice tone, inflection, and speech pacing. Note the noise floor of the recording (unwanted background noise) as well.
Pay attention to both differences and similarities from recording to recording, and take notes on your observations. These notes build a speech database you can draw on when writing the report.
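One way to keep such notes organized is sketched below in Python. This is only an illustration: the `VoiceNotes` class, the `compare` function, the trait names, and the sample values are all made up for the example and are not part of any forensic standard or software package.

```python
from dataclasses import dataclass, field


@dataclass
class VoiceNotes:
    """Observations about one recording. Names here are illustrative only."""
    recording: str
    traits: dict[str, str] = field(default_factory=dict)


def compare(a: VoiceNotes, b: VoiceNotes) -> tuple[dict, dict]:
    """Split the traits noted for BOTH recordings into similarities and differences."""
    shared = a.traits.keys() & b.traits.keys()
    similar = {k: a.traits[k] for k in shared if a.traits[k] == b.traits[k]}
    different = {k: (a.traits[k], b.traits[k])
                 for k in shared if a.traits[k] != b.traits[k]}
    return similar, different


# Hypothetical notes, invented for this example.
evidence = VoiceNotes("evidence", {"pitch": "low", "pacing": "fast", "accent": "regional"})
exemplar = VoiceNotes("exemplar", {"pitch": "low", "pacing": "slow", "accent": "regional"})

similar, different = compare(evidence, exemplar)
```

Traits noted for only one recording are left out of both results, which is a reminder to go back and listen for them in the other recording.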
Exemplars are defined as expert-supervised audio recordings of predetermined spoken word samples for the purpose of voice identification comparison. During the exemplar creation process, it is important to coach the person speaking for the recording (the subject) to match the energy level of the unknown voice on the evidence recording. Listen to the energy and attitude of the voice you are examining (the evidence or unknown recording). Do you hear a mood or psychological characteristic in the voice?
In some bomb threat recordings I have examined, the speakers sound angry, sad, or depressed as they speak the recorded words. It is important to note that at the time of creating an exemplar, the subject is often not in the same psychological state as the individual in the unknown recording. While making the exemplar, do your best to coach the subject to speak with the same energy as the voice on the evidence recording.
Your critical listening ear will help you complete this process to the best of your (and their) ability. You have to listen critically beyond the subject’s current mood because it is often difficult to coach them into the mood of the person on the evidence recording. Listen for specific speech characteristics in the exemplar and evidence recordings. What do you notice about the unknown voice that is characteristic of the known voice?
To practice, spend some time listening to spoken word recordings. These can be in the form of talk radio, podcasts, and audiobooks. Write down speaking characteristics of the voice recordings like this:
Pitch: low, medium, or high
Does the subject have a characteristic rhythm to their speech or a pattern of delivering words and pausing?
Listen to several spoken word recordings and make a list of speech characteristics. Take notes on your observations.
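For the pitch entry in your notes, a rough measurement can back up what your ear tells you. The sketch below is an assumption on my part rather than an established forensic procedure: it estimates the fundamental frequency of a short stretch of audio by autocorrelation, using plain Python. The function name `estimate_pitch_hz` and the 60 to 400 Hz search range are illustrative choices, and the input is a synthetic 125 Hz tone, not real speech. Real recordings would need a short, clearly voiced section and will give noisier results.

```python
import math


def estimate_pitch_hz(samples, rate, fmin=60.0, fmax=400.0):
    """Estimate fundamental frequency from the lag with the strongest autocorrelation.

    fmin and fmax bound the search; the defaults are a rough assumption for speech.
    """
    min_lag = int(rate / fmax)
    max_lag = int(rate / fmin)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Multiply the signal by a copy of itself shifted by `lag` samples.
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return rate / best_lag


# Synthetic test tone: 0.1 seconds of a 125 Hz sine wave sampled at 8000 Hz.
rate = 8000
tone = [math.sin(2 * math.pi * 125 * n / rate) for n in range(800)]
pitch = estimate_pitch_hz(tone, rate)
```

Whether a given number of hertz counts as low, medium, or high is still a judgment you make by ear and by comparison with the other voices you have examined.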
Only through practice and experience will you become familiar with voice identification. By creating a new audio comp or assembly file in Sony Sound Forge or Adobe Audition, you can place the speech sections you are comparing side by side and listen to them repeatedly with easy access. Back-to-back critical listening is an extremely important tool for voice identification. It is the best way to develop your critical listening skills and begin to recognize the different speaking characteristics of each voice examined. The familiar and unfamiliar speaking samples can be identified, and their characteristics can be easily noted.
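If you prefer to build an assembly file outside an audio editor, the same idea can be sketched with Python's standard `wave` module. This is a minimal illustration, not how Sound Forge or Audition work internally. The helper names `read_segment` and `build_assembly` are hypothetical, and the sketch assumes uncompressed PCM WAV files that all share one sample rate, bit depth, and channel count.

```python
import wave


def read_segment(path, start_s, end_s):
    """Return (params, raw frames) for the span start_s..end_s of a WAV file."""
    with wave.open(path, "rb") as w:
        rate = w.getframerate()
        w.setpos(int(start_s * rate))
        frames = w.readframes(int((end_s - start_s) * rate))
        return w.getparams(), frames


def build_assembly(segments, out_path, gap_s=0.5):
    """Write the segments back to back, separated by gap_s seconds of silence.

    segments is a list of (path, start_s, end_s) tuples.
    """
    first = None
    with wave.open(out_path, "wb") as out:
        for path, start_s, end_s in segments:
            params, frames = read_segment(path, start_s, end_s)
            if first is None:
                # The first segment sets the format of the assembly file.
                first = params
                out.setnchannels(params.nchannels)
                out.setsampwidth(params.sampwidth)
                out.setframerate(params.framerate)
            elif params[:3] != first[:3]:
                raise ValueError(f"{path} does not match the first segment's format")
            else:
                # 8-bit WAV audio is unsigned, so silence is 0x80; otherwise zero.
                zero = b"\x80" if first.sampwidth == 1 else b"\x00"
                gap_bytes = int(gap_s * first.framerate) * first.nchannels * first.sampwidth
                out.writeframes(zero * gap_bytes)
            out.writeframes(frames)
```

A call such as `build_assembly([("evidence.wav", 12.0, 15.5), ("exemplar.wav", 3.0, 6.5)], "assembly.wav")` would place a passage from the unknown recording directly before the matching passage from the exemplar. The file names and times here are placeholders.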