Authentication of Digital Audio Recordings

Tuesday, November 11th, 2014

One of our day-to-day activities as audio forensic experts is authenticating digital audio evidence. When one of the parties in a litigation believes that an audio recording was tampered with or edited, an audio forensic expert is brought in to investigate the recording. When we authenticate an audio recording, the first step is to establish chain of custody. While it is the first step, chain of custody does not, in and of itself, establish a recording as being authentic. I have seen audio evidence that was not authentic even though it was stored in a digital audio recorder. So why is audio authentication so important? What should an audio forensic expert be aware of when examining audio evidence? What is the process of examining and authenticating audio evidence? I am going to answer these questions and more in the following post.

A majority of the audio recordings we are hired to authenticate are created on digital audio recorders or on smartphones using a recording app. These devices are easily concealed in a pocket or purse. They come in many shapes and sizes, and they record in various formats. One of the first steps an audio forensic expert must take when authenticating a digital audio recording is to become familiar with the equipment that created the recording.

Importance of Authentication

The authentication process determines whether or not the audio recording in question has been tampered with. In this age of digital audio, edits can be made and covered up very easily. Free audio editing software, such as Audacity, is available online and can make edits that alter the events or conversation originally captured in a digital audio recording.

In the last 30 days, of all the audio authentication cases I was assigned, I found two had been edited. Both of the recordings were downloaded to a computer, edited, then played back and re-recorded through desktop computer speakers using a digital audio recorder. Most of the time, if an audio recording is edited after downloading to a computer and before authoring a CD, the editing can be detected in the digital recording’s metadata. During the forensic authentication process, the software that created the edits can often be detected in the HEX information of that edited recording.

If audio evidence is found to be altered, it should be ruled inadmissible in court because it is not an accurate representation of the events that occurred.

So what should the audio forensic expert be aware of during the authentication process?

First, establish the chain of custody. If the expert is able to retrieve the evidence from the original source, in most cases that will automatically establish an authentic chain of custody, or provide clues of tampering if the recording was edited and re-recorded. If it’s not possible for the forensic expert to retrieve the recording, then the forensic expert must carefully go through all of the documents and reports that arrived with the evidence. Sometimes a chain of custody log from law enforcement will be included, which will strengthen the authenticity of the audio evidence. But if the chain of custody cannot be established, the forensic examiner must rely on other techniques as well as their own expertise to determine the authenticity of the evidence. If further investigation reveals more inconsistencies in the recording and metadata, more often than not that recording is determined to be altered.

Digital audio recorders aren’t the only equipment that record audio evidence. CCTV surveillance systems, as well as most other digital video recorders, will include both audio and video in the recordings. As an Audio and Video Forensic Expert, I often work with both the video and audio from these recordings. When I receive digital media evidence that includes sight and sound, I analyze both audio and video using separate forensic processes. I have come across cases in which the video was unedited but the audio had been tampered with. In this case, I was unable to authenticate the evidence because a chain of custody could not be established. Plus, there were anomalies in the audio that could be measured, heard and documented.

Process of Examining and Authenticating Audio Evidence: Critical Listening

One of the first steps that I take when audio evidence arrives at our lab is to listen critically to the entire recording a number of times. During this process I note unusual sounding sections in the recording, which are called anomalies. I take notes and place markers using the forensic software so that I can find them later and include them in my forensic report.

These unusual sounding sections can be changes in the background ambience, inconsistent speech pacing and wording, or changes in the noise floor. The noise floor is a series of natural and electronic sounds that should be consistent throughout the recording. Noise is defined as any sound, such as hiss, hum, wind or HVAC, that is not part of the intended recording.

Critical listening must be the first step in becoming familiar with the audio evidence. Edits discovered during the critical listening phase usually take the form of abrupt changes. Detecting these changes is not easy and comes with experience.

It’s important for the forensic expert to put themselves in a quiet, isolated room during critical listening so as to avoid any outside disturbances. The quiet environment enhances the critical listening focus. High quality, professional grade monitoring headphones and high quality studio monitors (speakers) are best for critical listening analysis of digital audio recordings. Professional quality headphones and speakers will have the flattest frequency response, which means they produce neutral and natural sound. This is very important for the forensic expert because subtle boosts and cuts in frequencies can impact the analysis of the digital audio recording.

Sometimes frequencies may be more audible in headphones and sound clearer to the forensic expert while other frequencies may be better heard through speakers. When the forensic expert is examining audio evidence for authentication, it is important to use both headphones and speakers to hear every aspect of the recording.

In some audio evidence I have examined, I have been able to hear a second noise floor in the recording. This usually occurs when a recording is played through speakers or an auxiliary cable into another recorder. The original noise floor from the recording is heard along with the second noise floor created from the second recording.

Electronic Measurement

After critical listening, the forensic expert must use electronic measurement to examine the audio evidence. This is done by noting the prominent frequencies in the voices or other sound source and the noise floor. The levels of the recording and of the different frequencies can be measured as well. Tools such as spectrograms, frequency analysis windows and level meters are very helpful for observing and collecting this information. The expert should note the frequency range of the overall recording, the voices or conversation and the noise floor or extraneous sounds in the recording.
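
Noting the prominent frequency of a sound source can be illustrated with a short sketch. This is only a minimal illustration, not the forensic software referred to above: the function name `dominant_frequency`, the naive discrete Fourier transform, and the synthetic 120 Hz tone standing in for a recording are all assumptions made for the example.

```python
import cmath
import math

def dominant_frequency(samples, sample_rate):
    """Return the frequency (Hz) of the strongest non-DC bin of a naive DFT."""
    n = len(samples)
    best_bin, best_magnitude = 0, 0.0
    for k in range(1, n // 2):
        # Correlate the signal with a complex sinusoid at bin k.
        acc = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        if abs(acc) > best_magnitude:
            best_bin, best_magnitude = k, abs(acc)
    return best_bin * sample_rate / n

# Synthetic stand-in for a recording: a pure 120 Hz tone sampled at 8 kHz.
rate = 8000
tone = [math.sin(2 * math.pi * 120 * t / rate) for t in range(800)]
print(dominant_frequency(tone, rate))
```

A real recording would be analyzed in short windows with a fast Fourier transform; the slow direct sum is used here only to keep the sketch free of third-party libraries.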

If the frequency range of a voice suddenly becomes larger or smaller or shifts in frequency range, that can be a sign of an edit. Sudden, unexplained changes in the noise floor level as well as the sudden presence of another background noise can also be a sign of an edit. As I mentioned before, I have come across recordings in which I could hear two noise floors. This can often be measured and seen in a spectrogram and a frequency analysis panel.
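
A sudden change in noise floor level can be measured as well as heard. The sketch below is a hedged illustration: it assumes the examiner has already isolated speech-free samples scaled to the range -1.0 to 1.0, and the function names, frame size and 6 dB threshold are illustrative choices rather than any standard.

```python
import math

def frame_levels_db(samples, frame_size):
    """RMS level of each non-overlapping frame, in dB relative to full scale (1.0)."""
    levels = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        rms = math.sqrt(sum(x * x for x in frame) / frame_size)
        levels.append(20 * math.log10(max(rms, 1e-12)))  # guard against log of zero
    return levels

def level_jumps(levels, threshold_db=6.0):
    """Indices of frames whose level differs from the previous frame by more than the threshold."""
    return [i for i in range(1, len(levels))
            if abs(levels[i] - levels[i - 1]) > threshold_db]

# Synthetic stand-in: a quiet background that abruptly becomes 20 dB louder.
quiet = [0.001 * (-1) ** t for t in range(400)]
louder = [0.01 * (-1) ** t for t in range(400)]
levels = frame_levels_db(quiet + louder, frame_size=100)
print(level_jumps(levels))
```

A flagged frame is only a place to look more closely. A door opening or a fan switching on can change the level just as an edit can.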

Visual Inspection

Visually inspecting the audio waveform and spectrogram is the next step in authenticating the audio. This goes hand in hand with the electronic measurement as the forensic expert analyzes the physical wave properties and frequency information. Waveforms are continuous and smooth when examined very closely. Even a quick, loud sound like a clap will have a smooth, continuous wave. If there are sudden breaks in the waveform of a recording, these are signs of editing. The expert should also pay close attention to the phasing of the waveform. This can also be seen when visually zooming in to the waveform. If the waveform of the recording is suddenly inverted, this can also mean an edit was made.
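
The kind of break an examiner looks for when zooming in can also be searched for numerically. The following is a rough sketch under stated assumptions: the function name `find_discontinuities` and the rule of flagging any sample-to-sample step more than ten times the median step are illustrative, and the test signal is a synthetic tone that is phase-inverted partway through.

```python
import math
import statistics

def find_discontinuities(samples, factor=10.0):
    """Return sample indices where the step from the previous sample is unusually large."""
    steps = [abs(samples[i] - samples[i - 1]) for i in range(1, len(samples))]
    typical = statistics.median(steps)
    # steps[i] is the jump into sample i + 1.
    return [i + 1 for i, step in enumerate(steps) if step > factor * typical]

# Synthetic stand-in: a 100 Hz tone at 8 kHz whose polarity flips at sample 220.
rate = 8000
original = [math.sin(2 * math.pi * 100 * t / rate) for t in range(400)]
spliced = original[:220] + [-x for x in original[220:]]
print(find_discontinuities(spliced))
```

An edit made at a zero crossing would leave no such jump, so a clean result from a check like this does not by itself show that a recording is unedited.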

The spectrogram will show the full frequency spectrum with warmer or colder colors representing the strength of that frequency. The noise floor can be seen very clearly in this view, helping to identify breaks in the sound. All recordings have some noise floor, even if they are almost inaudible. When viewing the spectrogram, any breaks in the noise floor may be signs of an edit. Changes in the volume of the noise floor can also be a sign of an edit.

Analyzing Metadata in Digital Audio Evidence

When I first began working as an Audio Forensic Expert, most of my work was with analog audio evidence in the form of mini, micro and standard audio cassettes. I did have some cases where reel to reel tape was used. Today almost all recordings are made digitally, so there is additional information that can be analyzed when performing an audio authentication. Digital audio recordings contain metadata which reveals information about how the recording was made and the type of equipment that created the recording. If a recording was loaded into a software program capable of performing edits, there will often be a footprint left in the recording’s HEX information showing what software was used.
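
Searching the raw bytes of a file for a software footprint can be sketched as follows. This is an illustration only: the function names are hypothetical, the byte string is a synthetic stand-in rather than a real recording, and which text a given editor actually writes into a file (if any) is an assumption that must be confirmed by testing that editor. "ExampleEditor" is a made-up name.

```python
def find_software_footprints(data, signatures):
    """Return {signature: byte offset} for each signature found in the raw bytes."""
    lowered = data.lower()  # case-insensitive search over ASCII text
    hits = {}
    for signature in signatures:
        offset = lowered.find(signature.lower().encode("ascii"))
        if offset != -1:
            hits[signature] = offset
    return hits

def hex_dump_line(data, offset, width=16):
    """Format one hex-editor style line: offset, hex bytes, printable text."""
    chunk = data[offset:offset + width]
    hex_part = " ".join(f"{b:02x}" for b in chunk)
    text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
    return f"{offset:08x}  {hex_part}  {text_part}"

# Synthetic bytes loosely shaped like a file header with an embedded tag.
sample = (b"RIFF" + bytes(4) + b"WAVEfmt " + bytes(20)
          + b"LIST" + b"ISFT" + b"Audacity" + bytes(1))
for name, offset in find_software_footprints(sample, ["Audacity", "ExampleEditor"]).items():
    print(name, "found at", hex_dump_line(sample, offset))
```

The absence of a known string does not show that a file is unedited, since metadata can be stripped or rewritten.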

When examining the digital information, it is necessary to create an exemplar recording so that its metadata can be compared with that of the original. An exemplar is a recording that is made in conditions that are as close to the original recording as possible. The exemplar is made on the same kind of audio recorder and, if possible, in the same environment. Using this exemplar, the forensic expert can compare the metadata and HEX information of the two files. If there are inconsistencies in the data, that can also be a sign of tampering.
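
For WAV files, one small part of this comparison can be sketched with Python’s standard `wave` module. The helper names below are hypothetical, the two files are generated in memory purely for illustration, and a real comparison would cover far more than these three header fields.

```python
import io
import wave

def wav_parameters(wav_bytes):
    """Read basic format parameters from the header of a WAV file held in memory."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wf:
        return {
            "channels": wf.getnchannels(),
            "sample_width_bytes": wf.getsampwidth(),
            "sample_rate": wf.getframerate(),
        }

def compare_parameters(evidence, exemplar):
    """Return {field: (evidence value, exemplar value)} for every field that differs."""
    return {key: (evidence[key], exemplar[key])
            for key in evidence if evidence[key] != exemplar[key]}

def make_wav(channels, sample_rate, n_frames=100):
    """Build a silent 16-bit WAV in memory (stand-in for a real recording)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(2)
        wf.setframerate(sample_rate)
        wf.writeframes(bytes(2 * channels * n_frames))
    return buffer.getvalue()

evidence = wav_parameters(make_wav(channels=1, sample_rate=44100))
exemplar = wav_parameters(make_wav(channels=1, sample_rate=8000))
print(compare_parameters(evidence, exemplar))
```

A mismatch such as a different sample rate is a reason to investigate further, for example to check whether the recorder offers more than one quality setting.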

For a forensic expert to authenticate a piece of audio evidence, the expert must prove beyond any doubt that the recording is in its original form and has not undergone any tampering. If a piece of evidence is not authentic, it should not be used in court because it may be incomplete or altered to purport events that did not occur.

Audio Forensics: An Accurate, Arguable and Authentic Approach to Understanding Audio Evidence

Tuesday, June 4th, 2013

Bell Labs was the first to discover that spoken word patterns and sounds could be identified and their characteristics examined to identify the individual who made them. This has been a very important advancement in forensic science because the potential to assist law enforcement is well worth the effort it takes to defend the proponents and practitioners. Audio forensics is sometimes referred to as a “junk science.” After over 25 years of examining, editing and clarifying audio recordings, I can attest to and scientifically prove that voice identification and audio authentication comprise an exacting science that has huge benefit to the courts, law enforcement agencies and businesses.

In the following article, I will describe what works and does not work for two of the main activities of audio forensic experts: voice identification and audio authentication. I will also review and break down the steps and processes I employ and explain why I believe audio forensics is a valuable tool in litigation.

I have been retained for dozens of court cases, as well as by corporations, to analyze and help explain various aspects of audio evidence in one form or another. Some situations required that I find the truth about the source of a threatening voice, like a bomb threat called into 911 or a sexually harassing voicemail left on a victim’s phone.

Other cases involved defendants trying to validate or disqualify a pre-recorded audio confession. Evidentiary audio recordings all have one thing in common: they needed an experienced audio forensic expert to review and either qualify (validate) or disqualify the evidence. My job as an audio forensic expert is to determine the recording’s authenticity or to identify the person’s voice.

Voice Identification Overview

I have been practicing voice identification for over 25 years. Many of my skills and principles were learned during my employment as an audio engineer. Others I have learned through reading and study and by completing cases successfully. I believe people’s voices, just like fingerprints, can be identified through visual inspection of sound waves and spectrum analysis, as well as through critical listening skills. I have conducted voice identification for sexual harassment, workers compensation and employment harassment cases, as well as for various threatening voicemail messages like bomb threats.

In our country today, we are guilty until proven innocent, the opposite of what our United States Constitution promises. It is my job to determine the truth about voice recordings using visual, electronic and auditory inspection of both the evidence recording and an exemplar (a voice sample taken for the purpose of comparison).

A typical case I would review might involve a telephoned bomb threat or harassing call that was recorded on audiotape or digital voicemail. After the police arrested a suspect, I would be retained by either the state (court) or defense to determine the truth about that audio recording.

The first step is to examine the original evidence and learn as much about the recording as possible. How was it created? Who created it? What machinery was involved?

Then, with the help of the court or defense lawyer, I create an exemplar of the accused’s voice to compare visual, electronic and auditory characteristics.

Almost every legal case I have been engaged in has allowed my report and/or testimony into evidentiary status to aid with “due process.” I believe my success rate is high due to the fact that I employ the three testing platforms outlined above.

Steady advances in computer technology have had a huge impact on audio forensic voice identification. Having experience as an acoustic engineer who has listened to literally hundreds of hours of spoken word recordings, in addition to sophisticated electronic software programs, has contributed to my success with voice identification.

One case I examined involved a bomb threat. Bomb threats make up a fairly large segment of voice identification activity. The call in question was made from a pay phone outside of a convenience store to a 911 operator. This was scientifically evident when police traced the call.

The caller identified herself by name as an employee of XYZ Company. When the police arrived at XYZ Company, they found the employee with the name the caller gave the 911 operator and arrested her. The employee denied making the call.

She was charged with making a bomb threat call, guilty until proven innocent. I was retained by the defense to prove that our client did not make the bomb threat call.

Voice Identification Procedure

When comparing spoken word samples for the purpose of identification, I base my processes on historical information I have learned from the scientific community, state police crime labs, other forensic experts and designers and developers of electronic (especially computer) equipment and testing software programs. My process requires the visual, electronic, and auditory examination of every aspect of the words spoken, not just the pathological examination. The words themselves, the way the words flow together, the pauses between the words, the way the words are formed by the mouth and larynx can be measured using three processes. The first process is a visual examination of the sound wave, comparing the evidence and an exemplar (a voice sample of the accused). The second process is an electronic measurement of the evidence, which is then compared to the exemplar. The third process is perhaps the most important: critical listening skills that compare the evidence and the exemplar of how the words are spoken and pronounced. Noise floor and electronic measurement of speech and other audible sounds in the recording must also be considered and measured. Forensic procedure requires careful examination of all audio evidence characteristics, following procedures as outlined by the scientific community.

These scientific procedures begin with the analysis of the quality of the audio recording. It is important to establish that the quality of the recording in question is acceptable and workable. Sometimes, it may be necessary for an audio forensic expert to apply some light equalization or other non-destructive audio processing to reduce or remove background noise that may interfere with the forensic examination.

Voice identification requires the forensic examiner to discover similarities, as well as differences, in all three areas of investigation.

Here are the step-by-step processes I use when conducting voice identification:

1. Visual examination of the original recording, analogue or digital. This includes examination of the physical characteristics of the tape itself (if analogue) or analogue or digital recorder. It is important to examine the cassette tape (standard, mini or micro) or other analogue or digital source to determine if there are visual signs of tampering or alteration.

2. Once the physical evidence has been examined, the next step is to load the recording in question into a forensic computer. Visual examination of the sound wave, sonogram and spectrograph reveal speech characteristics and patterns of verbal delivery as well as electronic characteristics. At this point, the recording has been digitized so forensic software can analyze and conduct various tests.

3. If possible, for authentication or voice identification, an exemplar or comparison recording should be made of the original recording to compare the original recording characteristics. This same forensic examination process that is applied to the evidence is also applied to the exemplar to determine that the characteristics are the same and the recording is from the same audio recorder.

4. When conducting voice identification, it is important to create an exemplar of the accused for audio comparison using conditions and equipment as close as possible to those of the evidence, based on the measurements outlined above. The speech must be the same as the speech on the evidence in order for the testing to be accurate. As an audio forensic expert, I often have to coach the accused into the same energetic voice tone and inflection as the evidence recording. However, it is still possible to compare speech if the exemplar is not as close to the evidence as I would like.

5. Critical listening skills are used to examine the speech pattern, pronunciation, voice tone and inflection, accent, dialect and specific speech characteristics (like a lisp or significant “s” delivery). There is a rhythm in how an individual speaks, and even if s/he is trying to disguise his/her speech (in an attempt to fool the forensic examiner), the rhythm and speech patterns as described above still show through. The expert must pay careful attention to the rhythm of spoken word formations. I listen to single words as well as phrases and sentences. I like to compare original evidence sections of spoken word recordings as well as individual words. This is best accomplished by editing exemplars and original recordings back to back. It is extremely helpful to then make sub files of the words and sentences within a section, placed back to back with the corresponding exemplars. I repeat the assembly over and over to accommodate critical listening skills with the auditory identification process. That way, your ear can experience the sounds, vowel formations and consonants without interruption.

There are many character traits that can be experienced in a spoken word recording. It is important for the audio forensic expert to become familiar with the evidence speech patterns and visual and electronic characteristics. These characteristics are evident in a person’s voice even if he or she attempts to disguise it and they are compared to the exemplar.

Audio Authentication

Using many of the same tools as described above, audio authentication can help determine the validity of audio evidence that is being considered as evidence in litigation.

When authenticating an audio recording, it is important that the audio forensic expert pay careful attention to tone consistency of the audio recorded signal (speech) as well as the recording’s noise floor.

The consistent audio-recorded signal is important because audio recordings that are not authentic are almost always edited or fabricated assemblies of two or more audio recordings, made for the purpose of deceiving the person(s) listening to the recording. Using the tools described above, the audio forensic expert can measure the tone consistency to determine authenticity.

Those same tools can also measure the noise floor looking for inconsistencies in the room tone or background noise of the recording. These breaks or changes in either audio recorded signal or background noise are signs that the audio recording being considered may be counterfeit or fake.

Critical Listening Skills

I have been working with professional speakers and analyzing other spoken word recordings since 1980 and have developed my critical listening skills to a degree that far exceeds the average person’s sound perception. When I first hear audio evidence, I add exemplar recordings so I can listen to both back to back, then I apply my critical listening skills to determine the speech similarities as well as differences between the two.

In my early days as an audio engineer, I learned to edit reel to reel tape with razor blades to make a recording sound as if it were recorded start to finish without a single mistake. Some of my edits were pretty tricky. I got so good I could split words in two and even three edits to fix a problem or shorten a script. After a while, I became very familiar with speech characteristics and patterns as well as vocal tone and pronunciation.

The best way to become skilled in voice identification is to listen to hundreds of hours of forensic evidence to become familiar with the various speech pathological characteristics and develop critical listening skills.

There can sometimes be differences in speech patterns that can help identify clues. Listen for several similarities as well as differences, such as nasal resonance differences and voice tone with regard to inflection.

Voice Identification Conclusions

When conducting the examination, the audio forensic expert must look for similarities as well as differences in all three testing platforms to help arrive at a conclusion.

After the investigation and testing procedures are complete, the forensic expert’s report must arrive at one of the following conclusions: positive identification, probable identification, positive elimination, possible elimination or inconclusive.

The key to successful voice identification is to develop a methodology and standard procedure that you strictly follow every time you conduct an identification and comparison.

Audio Authentication Conclusion

Every tone change in either the audio recorded signal or background noise must be documented and analyzed as a whole before considering the recording genuine or authentic. All forensic concerns must be documented and listed in the forensic report to prove the audio forensics findings.

The Audio Forensic Report

It is my belief that the audio forensic report should include:

1. The introduction: What the expert was asked to do and how the expert arrived at their conclusion, including all scientific fact.

2. The testing processes you employed to examine the audio evidence.

3. The expert’s conclusion of the tests, including the expert’s opinion as to the relevant facts and concerns.

4. The expert’s curriculum vitae (resume) to establish credibility as an audio forensic expert, and to accommodate the Federal Court’s protocol for submitting an expert report.

5. A published article authored by the expert concerning the kind of testing relevant to the current case.

The Noise Floor: A Forensic Aid for Audio Authentication and Voice Identification

Tuesday, October 19th, 2010

Audio authentication and voice identification require that a forensic expert examine three critical aspects of an audio recording before beginning any forensic process. Whether the recording is analogue or digital, an audio forensic expert should inspect the characteristics of the sound wave formations; listen critically to the various tones present in the recording, including its background noise (noise floor); and examine the electronic spectrograph measurement. These three critical aspects of an audio recording must be consistent throughout the recording for it to be considered authentic.

An audio forensic expert has been trained by examining hundreds, if not thousands, of hours of audio recordings. This experience helps the forensic examiner to develop a critical listening skill far more precise than the average person’s. That keen sense of sound perception is very important for audio authentication and voice identification.

During the examination process, regardless of analogue or digital audio examination, it is advantageous that the original recording, and recorder, as well as other recording equipment (wireless transmitter, microphone) also be examined. That way, the forensic examiner can recreate the characteristics of the audio recording including signatures (stop-start) and noise floor.

The noise floor is a critical aspect in audio authentication as well as audio identification because it provides the forensic examiner a second dimension of sound to examine and authenticate other than the main recorded signals (speech, gunshot and voice mail).

Alterations in an audio recording, analogue or digital, most likely will be first detected by a change in the noise floor of the audio recording followed by an anomaly that can be heard auditorially, measured electronically and viewed on the computer screen by examining the wave form.

Part of this noise floor is the background noise of the recording. These are the sounds present on the audio recording that the author had not necessarily intended to record, but they are still part of the recording and are helpful to a forensic examiner.

Both analogue and digital audio recordings have background ambient noise, the noise floor, when the speech or other audio recorded is not present. This background noise speaks volumes on whether the audio recording being examined is original, authentic or has been altered or edited in addition to the examination outcome of the main recorded signals.

Clarification of Audio Recordings for Authentication

Thursday, October 14th, 2010

All recordings, both digital and analogue, have a noise floor. The term originated when manufacturers of analogue audio recorders referred to the extraneous noise that their machines created in addition to the desired recorded audio signal.

Often a background noise constitutes most of the audio recording and covers a portion of speech that needs to be audible in order to determine a series of events pertinent to the case. These noises can often be removed by the audio forensic expert to help determine facts about the series of recorded events.

Background noise and noise floor extraneous sound can consist of a heating or air conditioning fan running, refrigerator motor, window fan, clock, fluorescent lighting, wind, rain, car running and even radio or television. All these sounds contribute to the background noise and noise floor of a recording and aid the forensic examiner in authenticating a recording. However, this background noise can interfere with the forensic examination. Clarification is part of the forensic examiner’s job. It is appropriate for the forensic examiner to remove these background sounds in order to authenticate or clarify an exhibit of audio recorded evidence.

Some of the recordings experts are asked to authenticate are confession recordings created by law enforcement agencies. Defendants exclaim, “That is not what I said, they edited it” or “There is more I said that has been edited out of the recording.” Due process entitles both parties in litigation to examine any evidence presented in their case. However, original recordings are not always available for examination. How do you as a law enforcement official feel about the absence of original recordings?

I have worked on cases where missing “original evidence” was considered spoliation of evidence. Personally I believe that circumstances of each case should be considered by the forensic examiner before any decision has been made by either party.

If the forensic examiner observes characteristics that are noticeably questionable, then the expert must notify the officials in charge of their findings during the preliminary examination phase of the forensic investigation. Original recordings are required, and if not produced, a motion to suppress the evidence should be filed.


Audio Forensic Expert Demonstrating Audio Authentication

Friday, October 8th, 2010

If you ever wonder what an audio forensic expert does, here is a video of one of our activities: audio authentication.