Since I began my career as an audio forensic expert 28 years ago, I have been retained on two drug trafficking cases in which the U.S. prosecutor’s office was trying to convict based on five words spoken during a confidential informant (CI) recording.
I have been on both sides of the fence, working for the state as well as for the defense. The scientific community, specifically the American Board of Recorded Evidence, explicitly states that in order for an audio forensic expert to deliver a positive identification, “At least 90% of all comparable words must be very similar aurally and spectrally, producing not less than twenty (20) matching words. The voice samples must not be more than six (6) years apart.”
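The numeric thresholds in that standard can be written down as a simple checklist. The following is a minimal sketch in Python, not the Board's procedure: the class and function names are hypothetical, and the counts of comparable and matching words are assumed to come from the examiner's own aural and spectral comparison, which code cannot replace.

```python
from dataclasses import dataclass

# Thresholds taken from the standard quoted above.
MIN_MATCH_RATIO = 0.90      # at least 90% of comparable words must match
MIN_MATCHING_WORDS = 20     # not less than twenty matching words
MAX_YEARS_APART = 6         # samples not more than six years apart


@dataclass
class VoiceComparison:
    """Hypothetical summary of an examiner's word-by-word comparison."""
    comparable_words: int   # words present in both the unknown and known samples
    matching_words: int     # words the examiner judged very similar aurally and spectrally
    years_apart: float      # time between the two recordings


def meets_positive_id_thresholds(c: VoiceComparison) -> bool:
    """Return True only if all three numeric thresholds are satisfied."""
    if c.comparable_words <= 0:
        return False
    ratio = c.matching_words / c.comparable_words
    return (
        ratio >= MIN_MATCH_RATIO
        and c.matching_words >= MIN_MATCHING_WORDS
        and c.years_apart <= MAX_YEARS_APART
    )


# A five-word phrase can never reach twenty matching words,
# even if every word matches.
five_word_phrase = VoiceComparison(comparable_words=5, matching_words=5, years_apart=1)
longer_sample = VoiceComparison(comparable_words=22, matching_words=20, years_apart=2)

print(meets_positive_id_thresholds(five_word_phrase))
print(meets_positive_id_thresholds(longer_sample))
```

Passing this check is only a precondition. Whether two words are "very similar aurally and spectrally" remains the examiner's judgment.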
Voice identification is both an art and a science. It is an art in that the forensic examiner has to adapt best practices to each specific case. No two voice identification cases are the same. I have worked on similar situations, but every case has still been unique.
I have identified singers, good guys and bad guys. I have testified in cases where my testimony was crucial to the outcome. One thing remains certain: a government body cannot convict merely because it believes a specific person said the phrase that its case is built around.
For example, imagine I am with three friends and a confidential informant approaches us, inquiring about a crime we are plotting. One of us in the group says the “phrase that pays,” which lets the CI complete the sting. Unless the “phrase that pays” on that CI’s recording runs 20 words or more, it is technically their word against the defendant’s about who said the convicting phrase.
In other words, the difference between a first-degree murder conviction with the death penalty and a lesser judgment could come down to that phrase and to who spoke it on the CI recording.
An audio forensic expert can identify that voice by creating an exemplar of the accused and comparing at least 20 words spoken by the accused. I recently had a case where the state was going for a more severe conviction based on five words spoken during a CI recording. It is not possible to deliver a positive identification based on so few words.