BLOG

Sir Paul McCartney is NOT Dead- Actual Science

March 20th, 2015

So, you may have heard the rumor that Sir Paul McCartney is dead. In fact, the evidence that has been presented over the years is quite entertaining. The ‘Abbey Road’ album cover showing Paul barefoot, and songs like ‘Strawberry Fields Forever’ and ‘Revolution 9’ that are purported to contain messages about Paul’s passing, have all been linked to McCartney’s alleged death.

The rumor is that Sir Paul McCartney died in a car crash and was replaced with a man named Billy Shears who won the Paul McCartney lookalike contest. I am writing this blog post today to end the years of rumors with actual science: voice identification testing. The full voice identification report can be viewed below.

This assignment began when I received a call from Paul DuBay, a Beatles fan from San Antonio, Texas. He retained me to conduct this voice identification test because he wanted to know the truth.

The goal of voice identification testing and speaker recognition is to compare the known and unknown voices using critical listening, electronic measurement, and visual inspection of sound wave formation and spectrogram. The software programs I used for this voice identification test include Adobe Audition, Sony Sound Forge and Easy Voice Biometrics. Biometric technology is used as a secondary voice identification and speech recognition tool.

I used various songs performed and recorded before and after 1966 that feature the voice of Sir Paul McCartney. I was asked to compare songs from both of these time frames (pre and post 1966) to determine if the current Sir Paul McCartney is the same person as the pre-‘Paul is Dead’ (PID) Paul McCartney.

 


 

26 January 2015

 

Dear Mr. DuBay,

I am an audio and video forensic expert and have been practicing for over 30 years. I have testified in several courts (See updated CV attached) throughout the United States and worked on various international cases. My forensic practices for audio investigation include digital and analogue audio authentication, restoration and voice identification. As a video forensic expert, my practices include video authentication, restoration and identification. (Audio is mentioned first – note to Paul)

I received from you the following digital audio music files:

Song Title/Music File Name:                     LP Title:

  • ‘1963 I Saw Her Standing There.m4a’         Please Please Me
  • ‘2002 I Saw Her Standing There(Live).m4a’   Back In the U.S.
  • ‘1965 I’ve Just Seen a Face.m4a’            Help!
  • ‘1976 I’ve Just Seen a Face (Live).m4a’     Wings Over America
  • ‘1964 Kansas City.m4a’                      Beatles For Sale
  • ‘1988 Kansas City.m4a’                      CHOBA
  • ‘1967 Sgt Pepper’s Lonely HCB.mp3’          Sgt Pepper
  • ‘1964 Long Tall Sally.m4a’                  Beatles Second Album
  • ‘Too Many People.m4a’                       Ram

These songs were “before and after” samples from either the pre-publicity of Paul McCartney’s alleged death in 1966 or the post-‘Paul is Dead’ (PID) news story.

I am familiar with the theories that Paul McCartney was killed in a car accident in 1966. This is why you contacted me and asked that I perform voice identification and speaker recognition testing on various songs performed and recorded before and after 1966. You asked that I compare these songs from both of these time frames (pre and post 1966) to determine if the current Sir Paul McCartney is the same person as the pre-‘Paul is Dead’ (PID) Paul McCartney.

The goal of a voice identification test and speaker recognition is to compare the known and unknown voices using critical listening, electronic measurement, and visual inspection of sound wave formation and spectrogram. The software programs I used for this voice identification test include Adobe Audition, Sony Sound Forge and Easy Voice Biometrics. Biometric technology is used as a secondary voice identification and speech recognition tool.

I help my clients understand voice identification testing and speech recognition by using familiar voice examples. If you are in your office and a fellow employee comes into the office that you have worked with for many years and says ‘hello’, you recognize that voice without making eye contact. The same is true when you are at home and your spouse or even a relative comes in and says hello and begins talking to you. You know who the voice is before making eye contact because you are familiar with the voice. This is how critical listening examination is conducted during voice identification tests. I have been performing voice identification testing most of my career as an audio forensic expert. In fact, throughout the course of my career I have performed dozens of successful voice identification tests including a test for CNN on the voice of Apple’s ‘Siri’.

When beginning a voice identification test, I first become extremely familiar with both the known and unknown voices and list all similarities as well as differences during this repeated listening. In the case of Sir Paul McCartney, I listened to all songs repeatedly during this critical listening phase. I also measured and viewed the sound spectrum and wave formation repeatedly to arrive at my professional conclusion.

The following report will include descriptions of the similarities observed during critical listening, electronic measurement, visual inspection as well as biometric testing. I have not observed any differences in any of the voice samples tested.

I began by downloading the digital audio files that you sent onto my forensic computer then opened all using Adobe Audition CS 5.5. Next, I began critical listening to all of the vocal samples multiple times to become extremely familiar with all voice samples presented.

I focused first on the two samples of ‘I Saw Her Standing There’ as they were superior audio samples that also include a vocal number count at the beginning. Next, I focused on ‘Long Tall Sally’ and ‘Sergeant Pepper’s Lonely Hearts Club Band’ as their vocal deliveries are extremely similar and unique.

The beginning count (1-2-3-4) of the ‘1963 I Saw Her Standing There.m4a’ version and the beginning count (1-2-3-4) from the ‘2002 Live Version of I Saw Her Standing There’ are identical, which indicates the rhythm of Sir Paul’s internal metronome is the same. The vocal range and phrasing in both samples are also the same. The slight difference in vocal tone is attributed to the age difference of Sir Paul, as he has matured over the years and so has his voice.

The spectrogram image below shows the same frequency spectrum in spite of the difference in years:

[Spectrogram image: ‘I Saw Her Standing There’, 1963 recording (left) and 2002 recording (right)]

In the above image, the spectral frequency display is shown in color in the lower half of the image. The brighter yellow colors indicate the stronger, higher-volume frequencies present in that portion of the audio, while purple and black colors represent frequencies that are weaker or lower in volume. The audio on the left side of the image is the original recording of the song from 1963 and the portion on the right is the more recent recording from 2002. These can be heard in the comparison audio work product attached to this report.

When closely examining the formant frequencies shown in the spectral display above, it is noted that they are nearly identical. Formants are resonances, or spectral peaks, created by a human voice. These are the frequencies that have the highest presence in a person’s voice and determine most of the tonal qualities of that individual voice. Because the formants in both recordings are almost identical, I conclude beyond a reasonable degree of scientific certainty that they are the same voice, Sir Paul McCartney. The slight variations can be explained by the age difference of the voice between the recordings.

The next spectrogram image below shows the sample ‘Wooo’ from the chorus of ‘I Saw Her Standing There’. The original recording from 1963 is displayed on the left and the newer recording from 2002 is shown on the right. Note the fundamental frequencies that are circled in blue in both recordings. Through close examination (narrow band spectrum analysis), it is clear that the fundamental frequencies, harmonics and the range of the frequencies in the early recording and more recent recording are identical. Critical listening also revealed no differences between the two samples, which can also be heard in the comparison audio work product attached to this report. Therefore, through both visual inspection and critical listening, I have determined that the voice in each sample of ‘I Saw Her Standing There’ is the same voice and that of Sir Paul McCartney.

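[Spectrogram image: ‘Wooo’ sample from ‘I Saw Her Standing There’, 1963 recording (left) and 2002 recording (right), fundamental frequencies circled in blue]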

I continued investigating all songs submitted for voice identification testing and found similar results and arrived at the same conclusion. All vocals from all songs submitted are that of Sir Paul McCartney.

My next comparison was between the ‘1964 Kansas City’ Recording and the ‘1988 Kansas City’ recording. This is shown in the spectrogram image below:

[Spectrogram image: ‘Kansas City’, 1964 recording (left) and 1988 recording (right), tested sections circled]

The sections that I chose to test are circled. They are the words ‘Kansas City’. The recording to the left is the recording of the song from 1964 and the recording of the song on the right is from 1988. Through close visual inspection of the prominent frequencies in the words ‘Kansas City’, I found that both the fundamental frequencies and the frequency ranges are again nearly identical. I also used critical listening to further support my findings and have determined that the voice from the 1964 recording ‘Kansas City’ is identical to the voice in the 1988 recording. The vocal expression, pronunciation of the words and voice range are an exact match. I continue to conclude that all vocals from all songs submitted are that of Sir Paul McCartney.

The voice tones of all songs examined, old and new, are extremely close and often identical when listening critically and viewing the narrow band waveform and frequency spectrum. Songs that were recorded farther apart in time have some small differences, which can be explained by Sir Paul’s difference in age when they were recorded. As people age, their voice changes and so does their body. Vocal cords mature and the voice usually grows deeper. Think of a boy going through puberty: the vocal cords mature and so does the voice. The same applies to people who enter their later years. Even though there are these slight differences, fundamental parts of the voice always remain the same.

I believe and will prove scientifically that a person’s singing voice is as unique if not more unique than their speaking voice. In the following paragraphs I will compare the pre-1966 Paul singing style with the post-1966 Paul singing style by critical analysis of Sir Paul’s vocal timbre and very loud and distinct voice. I have made observations while critically listening to upper register, near falsetto, voice signatures for the songs Long Tall Sally and Sgt Pepper’s Lonely Hearts Club Band. I will use the studio recorded versions of both songs; however, I have chosen the deconstructing Sgt Pepper isolated vocal to compare to the Long Tall Sally vocal. This back to back comparison file will be available in my audio work product attached to this report.

In Long Tall Sally, pre 1966, Sir Paul’s voice is very forceful and distinct. His vocal range and style of delivery are exact. His O’s in the lyric ‘OOOUU Baby- some fun tonight’ match his vocal style in the first verse of Sgt Pepper’s Lonely Hearts Club Band.

Furthermore, listening to the 1971 studio recording of ‘Too Many People’ during the intro near falsetto adlib ‘piss off cake ay ay ay…ooooh’ at the beginning of the song, I clearly hear the exact same falsetto vocal Sir Paul delivered during his entire career with the Beatles and solo. It is an extremely unique style of singing that can only be produced by the real Sir Paul McCartney. See audio work product attached to this report. All vocals from all songs submitted are that of Sir Paul McCartney.

In the image below, Sgt Pepper is on the left and Long Tall Sally is on the right. Notice the sound spectrum ‘fingerprints’ between each vocal sample are nearly identical in display. Considering that these are different songs, this is a very significant identifier that both vocals are sung by Sir Paul McCartney.

[Spectrogram image: ‘Sgt Pepper’ vocal sample (left) and ‘Long Tall Sally’ vocal sample (right)]

I have also loaded isolated segments from ‘Sgt Pepper’ and ‘Long Tall Sally’ into a voice biometrics software program which is capable of taking unique but different voice samples and comparing them biometrically resulting in a percentage of certainty for identification.

I loaded isolated sections of Sir Paul from the beginning of ‘Long Tall Sally’ and the isolated first verse of Sir Paul’s vocal from ‘Sgt Pepper’s Lonely Hearts Club Band’ into Easy Voice Biometrics. The test resulted in a 53% match. I believe the percentage rate is high enough to confirm a positive identification even though by biometric software standards we would like to see a higher percentage of certainty. In my opinion, this is due to the different words being sung and measured in two different songs. See screen shot of EVB test result below:

[Screen shot: Easy Voice Biometrics test result]

The biometric test was done as a secondary test to determine the voice similarities using another voice identification and speaker recognition testing process. Critical listening is the primary voice identification tool that I used to arrive at my conclusion (please see and hear audio work product attached to this report).

Conclusion

Listening to people like Dick Clark and Ringo Starr speaking as well as singing through the years, you can hear how their voices have matured yet are identifiable as being from the same person. This maturity fact is why I point out that Sir Paul McCartney’s voice has also matured. Sir Paul has an incredible voice that is extremely unique and, based on my 31 years’ experience as an audio forensic expert and on scientific forensic testing, there is no other voice in the world that comes close to sounding the same or measures spectrographically the same as Paul McCartney.

Through careful analysis of the waveform and spectrogram as well as critical listening and biometric measurement, I conclude beyond a reasonable degree of scientific certainty that the voice heard in all of the song samples examined is that of Sir Paul McCartney. This voice identification test confirms that the rumor that Paul is dead is not true.

This concludes this voice identification testing.

Respectfully submitted,

Edward J. Primeau, CCI, CFC


 

How to Set Up a Microphone for CCTV Systems

March 10th, 2015

Closed circuit television systems have become a major contributor of evidence to court cases. While the video from these systems is often very important, the audio can often play just as much of a role in the investigation. At Primeau Forensics, we often are hired to enhance not only the video from a CCTV system, but the audio as well. Clients typically hire us for enhancements because the original CCTV system was not set up properly and was capturing less than ideal quality audio and video. Many times, the audio is more valuable than the video because of what was said during the event. While enhancing the audio is possible, setting the microphone on the system correctly can be extremely beneficial when an incident does occur. Getting a good, clean signal from a microphone relies on two key principles: microphone gain and microphone placement.

Setting a proper gain structure for a microphone will always yield the best result for any kind of recording. Gain is applied to microphones because microphones have inherently low levels. A preamp is used to amplify the signal before it is recorded into a system. When setting the gain, the goal is to get a high enough level that the signal is audible, while also making sure that the level does not clip the system or preamp. Clipping means that the signal has exceeded the capabilities of the system and begins to distort. This distortion hurts the quality of the audio and can make it very difficult to understand what people are saying in a recording.

Gain structure is often set based on the input signal, which makes setting a surveillance system microphone difficult. The input signal of a surveillance system is always changing and cannot be manually reset whenever people enter or leave the area. When setting the gain for a surveillance system microphone, it is usually a good idea to test different levels of sound in the room. Having someone talk or even yell in the room while you set the level can ensure that the recording will not clip when it is recording later on.
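As a rough illustration of checking a test recording for clipping, the Python sketch below measures the peak level and the share of samples stuck at the ceiling. The function names, the 0.999 threshold, and the 440 Hz test tone are assumptions made for this example, and the samples are assumed to be normalized so that 1.0 is the loudest value the system can represent.

```python
import math


def peak_dbfs(samples, full_scale=1.0):
    """Peak level in dB relative to full scale (0 dBFS is the ceiling)."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return float("-inf")
    return 20 * math.log10(peak / full_scale)


def clipped_fraction(samples, full_scale=1.0, threshold=0.999):
    """Fraction of samples sitting at (or within 0.1% of) full scale."""
    if not samples:
        return 0.0
    limit = threshold * full_scale
    return sum(1 for s in samples if abs(s) >= limit) / len(samples)


def hard_clip(samples, full_scale=1.0):
    """Simulate a preamp or converter that cannot exceed full scale."""
    return [max(-full_scale, min(full_scale, s)) for s in samples]


# Hypothetical test signal: one second of a 440 Hz tone sampled at 8 kHz.
tone = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(8000)]

safe = [0.5 * s for s in tone]                # gain set conservatively
too_hot = hard_clip([2.0 * s for s in tone])  # gain set too high, so it clips

print(round(peak_dbfs(safe), 1), clipped_fraction(safe))
print(round(peak_dbfs(too_hot), 1), round(clipped_fraction(too_hot), 2))
```

If the loudest test sound in the room (someone yelling, for example) still leaves the peak a few dB below 0 dBFS and the clipped fraction at zero, the gain has headroom for real events.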

We recently recovered some surveillance video evidence that required an audio enhancement. When we received the audio, we found that the gain had been set too high and the entire recording had clipped. We also found that because the room was small and filled entirely with hard surfaces, there was a buildup of reverberant sound. Reverb consists of reflections of sound off of surrounding surfaces. Some reverb is always present, but too much can begin to cover up the direct sound. Direct sound is the original sound coming directly from the source, such as a person speaking. In this case, the gain should have been set much lower on the microphone. This would have produced a much cleaner and more audible recording. Reverb is a more difficult issue to combat and relies much more on the microphone placement.

Different microphones will have different pickup patterns, which means that they will pick up sound in different directional patterns. Typical microphones used for surveillance systems are either cardioid or omni-directional microphones. Cardioid microphones pick up sound from one direction and reject sound from the opposite direction. Omni-directional microphones pick up sound from all directions. Knowing what kind of microphone your system is using is the first step to setting up a microphone. If you are using a directional microphone, it should be aimed at the area where the sound sources will be. Omni-directional microphones are easier to set up because they do not need to be aimed in any direction.

When placing the microphone, it is important to be aware of other extraneous noises in the room. We often see CCTV systems placed near the ceiling and in corners so they can obtain the best view point of the area. This is not always the best location for the microphone depending on how the room is designed. If an air vent or another electrical device is near the microphone, it will add a large amount of noise to the recording and can cover up desired sound. If a directional microphone is being used, try aiming the rejection end of the microphone at the unwanted sound source. This means the least amount of the unwanted signal will be picked up.
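To show why aiming the rejection end matters, here is a small Python sketch of the textbook polar pattern for an ideal cardioid microphone, where sensitivity is 0.5 × (1 + cos θ) and θ is the angle away from the front of the microphone. Real microphones only approximate this curve and vary with frequency, so treat the numbers as a guide rather than a specification. The function names are my own.

```python
import math


def cardioid_gain(angle_deg):
    """Relative sensitivity of an ideal cardioid, 0 degrees = front."""
    return 0.5 * (1.0 + math.cos(math.radians(angle_deg)))


def gain_db(gain):
    """Convert a linear gain to decibels; near-zero gain is reported as -inf."""
    return float("-inf") if gain <= 1e-12 else 20 * math.log10(gain)


# Compare sound arriving from the front, the side, and the rear.
for angle in (0, 90, 135, 180):
    g = cardioid_gain(angle)
    print(angle, round(g, 3), gain_db(g))
```

In this ideal model, a noise source directly behind the microphone (180 degrees) is rejected completely, while one at the side (90 degrees) is only reduced by about 6 dB.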

Reverb can also be an issue in smaller spaces that have no absorptive surfaces. When the direct sound is buried by the reverb, it can make the desired signal muddy and undefined. Acoustic treatment can be added to a room to deaden the amount of reverb, but this is more often an approach for musical spaces. A typical fix for a surveillance system can be placing the microphone closer to the desired sound. Placing the microphone in the center of the ceiling instead of in a corner could cause the microphone to pick up more of the direct sound, resulting in better and clearer audio. Because sound and reverb tend to build up more in corners, placing the microphone away from the corner will also prevent it from picking up those extra reflections.

Audio evidence is a very prominent part of many investigations and court cases. Setting up a microphone properly for a surveillance system can often make a huge impact on whether that audio can be used as evidence or not. As an audio forensic expert, I come across many audio recordings that could have been much more audible if the system had been set up properly. When installing a surveillance system, setting the gain for the microphone and placing the microphone properly will always improve the quality of the recorded audio.

Blindspot – How to Make Digital Audio Recordings for Evidence

February 19th, 2015

As an Audio Forensic Expert, my day to day activity includes forensic services like audio enhancement and authentication, as well as voice identification. Audio enhancement is probably the most common service I provide, because more often than not, the audio evidence was not recorded in the best way possible. Audio evidence can often be one of the most important pieces of evidence for a case, so it should always be given a great deal of attention.

One of the most common ways people create digital audio evidence is by using digital audio recorders. Law enforcement will often use them for interrogations and confessions, and sometimes even out in the field as a backup for their dash cam or body cam. People outside of law enforcement use them for creating audio evidence as well.

I would like to mention that concealed audio recordings are not always legal. Federal law states that creating an audio recording only requires one person’s consent, but some states follow a ‘two-party consent’ law. This law means that all parties who are on the recording must give permission to the person recording in order for it to be used as evidence in court. I highly suggest looking into your own state’s laws regarding concealed audio recordings before making one.

When creating a digital audio recording that is going to be used in court, there are many things one should be aware of before making the actual recording. The biggest issue I usually come across is low recording levels. While it is possible to increase the signal level afterwards through forensic audio enhancement, this is unnecessary time and money spent. This will also increase the noise floor of the recording, which can make it more difficult to hear what is happening in the recording. Creating a clean and audible original recording can make the enhancement process much easier and can often make the evidence much more useable in court.

When preparing to make an audio recording, regardless of whether it is a concealed recording or an interrogation recording, the user should always look at the settings of the digital audio recorder.

Two major settings determine the quality of a digital audio recording: sample rate and bit depth. Together, these settings also determine the bit rate of a recording. Changing these settings will affect both the quality of the audio recording and the amount of space used on the digital recorder. When creating digital audio evidence, it is necessary to balance these two in order to get a high quality recording while optimizing the amount of space on the digital recorder. Thankfully, many digital audio recorders will record in lossy compressed formats like MP3 files, which take up much less space and don’t sacrifice a lot of quality.
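For an uncompressed recording, the relationship between these settings is simple arithmetic: bit rate equals sample rate times bit depth times the number of channels. The Python sketch below works through it; the function names and the example settings are chosen only for illustration.

```python
def pcm_bit_rate(sample_rate_hz, bit_depth, channels=1):
    """Bits per second for uncompressed (PCM) audio."""
    return sample_rate_hz * bit_depth * channels


def megabytes_per_hour(bit_rate_bps):
    """Storage used by one hour of audio, in megabytes (1 MB = 1,000,000 bytes)."""
    return bit_rate_bps * 3600 / 8 / 1_000_000


# CD-style stereo versus a modest mono voice setting.
cd_quality = pcm_bit_rate(44_100, 16, channels=2)
voice_mono = pcm_bit_rate(8_000, 16, channels=1)

print(cd_quality, megabytes_per_hour(cd_quality))
print(voice_mono, megabytes_per_hour(voice_mono))
```

Running the numbers this way before a long recording makes the trade-off between quality and space concrete.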

When recording digital audio in an MP3 file format, the two key settings to pay attention to are the sample rate and the bit rate. The sample rate will ultimately determine the range of frequencies the recorder picks up. At least two samples per cycle are needed to record any frequency, which means the sample rate must be at least twice as high as the highest frequency you need to record.

The range of human hearing is roughly between 20Hz and 20kHz. Typical audio recordings are done at 44.1kHz to capture the full range of human hearing. While this is standard for music and other professional recordings, it is not always necessary for audio evidence.

Most fundamental frequencies of the voice are between 100 and 500Hz with some of the most important harmonic content between 1kHz and 4kHz. This means that a sample rate as low as 8kHz can sometimes be adequate for recording a conversation, which will also save a large amount of space on the digital recorder.
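The sampling rule described above can be sketched in a few lines of Python. The function names are my own, and the alias calculation assumes an idealized recorder with no anti-aliasing filter, which real recorders normally include.

```python
def nyquist_frequency(sample_rate_hz):
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2


def min_sample_rate(highest_frequency_hz):
    """Lowest sample rate that still captures the given frequency."""
    return 2 * highest_frequency_hz


def alias_frequency(tone_hz, sample_rate_hz):
    """Where a tone lands after sampling; tones above Nyquist fold back."""
    return abs(tone_hz - sample_rate_hz * round(tone_hz / sample_rate_hz))


print(min_sample_rate(4_000))          # speech content up to 4 kHz
print(min_sample_rate(20_000))         # full range of human hearing
print(nyquist_frequency(44_100))       # ceiling of a 44.1 kHz recording
print(alias_frequency(5_000, 8_000))   # a 5 kHz tone recorded at 8 kHz
```

The last line shows what happens when the sample rate is too low: content above the limit is not simply lost, it can reappear at a different, lower frequency.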

Bit rate determines the number of bits that are processed per second, which determines the fidelity of the audio. Typical MP3 files are recorded between 192kbps (kilobits per second) and 320kbps, but they can be as low as 32kbps. Just like with the sample rate, a higher bit rate means a higher quality of audio but also a larger file size. The issue that arises with low bit rates is that the compression process applied to the file can start creating digital noise in the recording. This digital noise can often cover up parts of the recording and once it is there, it is very difficult to remove.
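To get a feel for how bit rate translates into recording time, the Python sketch below estimates how many hours fit on a recorder at a constant bit rate. The 4 GB capacity is a hypothetical figure, and the calculation ignores file-system overhead and variable bit rate encoding.

```python
def hours_of_recording(capacity_bytes, bit_rate_kbps):
    """Approximate hours of audio that fit at a constant bit rate."""
    seconds = capacity_bytes * 8 / (bit_rate_kbps * 1000)
    return seconds / 3600


capacity = 4_000_000_000  # hypothetical 4 GB recorder

for kbps in (32, 192, 320):
    print(kbps, round(hours_of_recording(capacity, kbps), 1))
```

The spread between the lowest and highest settings is a factor of ten, which is why it pays to test what the lowest acceptable quality sounds like before an important recording.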

When determining what settings to use on a digital recorder, it is always a good idea to make multiple test recordings before making an audio recording that will be used as evidence. These test recordings will let you try out the various settings and then listen back to see what sounds best and what fits your needs the most.

Another setting that is sometimes included on digital recorders is the ‘voice activation’ setting. This setting will start and stop the recording based on the amount of signal the microphone is picking up. While it can be a good way to save space on the recorder, it is not recommended that this setting be used when creating any kind of digital audio evidence. If this setting is on, the digital recorder could stop recording at a key moment in the conversation and miss a crucial piece of evidence. If extra space is needed on the digital recorder, adjusting the quality settings is a much better way to go. Recording all of the content at a slightly lower quality is a lot safer than relying on the ‘voice activation’ setting and missing important content.

Monitoring the battery life on the digital recorder is another very important thing to keep in mind. In some applications, like recording an interrogation, the digital audio recorder can simply be plugged into the wall so it will not run out of power. In other cases where you do not have this option, make sure the battery is fully charged or you have put in new, good quality batteries. Keeping extra batteries with you is also good practice, just in case the recorder does run out of battery and needs a replacement.

When creating the actual recording, try to be as close as possible to the person being recorded. As I mentioned before, one of the biggest issues with audio evidence is a low volume or record signal level. The farther away from the source the microphone is, the lower the signal level and the lower the signal to noise ratio. This means that less of the desired signal and more of the unwanted background noise will be recorded. Background noise can include any extraneous sounds such as furnaces, refrigerators, air conditioners, televisions or even the internal sound created by the digital recorder itself. These sounds can detract from the quality of the recording and often make the desired signals unintelligible.
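The effect of distance can be estimated with the inverse-square law: in open space, each doubling of distance lowers the level of the direct sound by about 6 dB, while steady background noise stays roughly the same. The Python sketch below applies that rule. It is a simplification that assumes a free field with no reflections, so real rooms will behave somewhat differently, and the starting signal-to-noise figure is made up for the example.

```python
import math


def level_change_db(near_m, far_m):
    """Change in direct-sound level when moving from near_m to far_m."""
    return -20 * math.log10(far_m / near_m)


def snr_at_distance(snr_near_db, near_m, far_m):
    """Signal-to-noise ratio at far_m, assuming constant background noise."""
    return snr_near_db + level_change_db(near_m, far_m)


# Hypothetical: 30 dB signal-to-noise with the recorder 0.5 m from the speaker.
for distance in (0.5, 1, 2, 4):
    print(distance, round(snr_at_distance(30, 0.5, distance), 1))
```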

Placing the digital recorder in a good location is key for making a good digital audio recording. Keep a few things in mind when making your recording. First, the microphone should always be aimed at the subject that you are recording. When placing the recorder in a pocket or a purse, aim the microphone towards the subject. Also make sure that the digital recorder is relatively stable in its location, because any movement of the recorder will be picked up by the microphone and can cover up other parts of the recording. Pay attention to any materials that may be in between the microphone and the sound source; the thicker the material, the more damping there will be on the signal, which will decrease the record level.

Many digital audio recorders have a microphone input which allows you to use an external microphone. The external microphone is always the best option to use if the recorder is going to be placed inside something. When using this option, it is always a good idea to use a high quality external microphone.

There are many different types of microphones that will work better for different situations. Lavaliere microphones are extremely helpful because they are small and usually omnidirectional. This means that they will pick up sounds from all directions and they can be placed anywhere on your person while the digital recorder stays in your purse or pocket. Other microphones, such as directional microphones, may work better during police interrogations because the subject will not be moving during the recording.

As I mentioned before, always create a test recording before making the recording that will be used as evidence. Testing different microphones, microphone placements and locations will help you learn how your digital recorder works and responds to different environments. If possible, try conducting the test recording in the same place that you will create the real audio evidence so you can prepare for any extraneous background noises and other obstacles. After making the test recordings, listen back so that you can make sure the desired sounds can be heard and the sound quality is high enough.

 

Voice Identification: Characteristics of an Unknown Voice

January 12th, 2015

One of the most important elements of Voice Identification is the ability to recognize the characteristics of the human voice. There are many elements that distinguish these characteristics, some aural, some visual.

Think about when you have your back to a person who enters the room and says hello. If it is your child, spouse or co-worker, I bet you recognize them immediately because you are familiar with their voice. This is the starting point for voice identification: becoming familiar with the characteristics of the unknown voice.

I began editing spoken word on reel-to-reel tape with razor blades and splicing tape. I had to learn to visualize the words in my mind’s eye in order to cut the tape in the right place. Today, we have software programs that display the waveform and sound spectrum of the spoken words, which make the editing process more accommodating. The editor can see the way the words look on the computer screen while deciding where to make the edit and connect the sentences, removing the stutters, coughs, gaps and mistakes.

During the editing process, you will learn to listen for voice characteristics almost subconsciously. These characteristics include the way the words are spoken, the word pronunciations, vowel and consonant pronunciations, the recording noise floor (unwanted background noise), the way the words flow together, and significant patterns of speech you may detect, like accent, dialect and impediments, nasal cavity resonance, voice tone and inflection and speech pacing.

Pay attention to both differences and similarities from recording to recording, and take notes on your observations to build a speech database that you can draw on when writing the report.

Exemplars are defined as expert supervised audio recordings of predetermined spoken word samples for the purpose of voice identification comparison. During the exemplar creation process it is important to coach the person (subject) speaking for the recording into the same level of energy as the evidence recording of the unknown voice. Listen to the energy and attitude of the voice you are examining (evidence or unknown recording). Do you hear a mood or psychological characteristic in the voice?

In some bomb threat recordings I have examined, the speakers have an angry, sad or depressed attitude in their voice while speaking the recorded words. It is important to note that at the time of creating an exemplar, the subject is often not in the same psychological state as the individual in the unknown recording. While making the exemplar, do your best to coach the person (suspect) to speak with the same energy as the voice on the evidence recording.

Your critical listening ear will help you complete this process to the best of your (and their) ability. You have to listen critically beyond the subject’s current mood, because it is often difficult to coach them into the mood of the person on the evidence recording. Listen for specific speech characteristics in the exemplar and evidence recordings. What do you notice about the unknown voice that is characteristic of the known voice?

To practice, spend some time listening to spoken word recordings. These can be in the form of talk radio, podcasts and audio books. Write down speaking characteristics of the voice recordings like this:

• English accent
• Southern accent
• Consistent sibilant “s”
• Consistent long “a”
• Medium pitch, low pitch, high pitch
• Emphasis on “al” as in “halp” instead of “help”
• Does the subject have a characteristic rhythm to his speech or a pattern of delivering words and pausing?

Listen to several spoken word recordings and make a list of speech characteristics. Take notes on your observations.

Only through practice and experience will you become familiar with voice identification. When creating a new audio comp or assembly file in Sony Sound Forge or Adobe Audition, you will be able to listen to the speech sections that you are comparing repeatedly and with easy access. Back-to-back critical listening is an extremely important tool for voice identification. It is the best way to develop your critical listening skills and begin to recognize the different speaking characteristics of each voice examined. The familiar and unfamiliar speaking samples can be identified and characteristics can be easily noted.

Learn more about Voice Identification and Critical Listening in Forensic Expert Ed Primeau’s new book, That’s Not My Voice! available now on Amazon.

Knowing Your Digital Audio Recorder

December 18th, 2014

With digital audio recorders, there are a lot of options when it comes to the quality of the audio recording. Despite the easy access to these options, they are often overlooked. People are either unaware of these settings, or simply forget to check them when they begin a recording. While most settings on a digital recorder will yield a good enough quality recording, I have come across digital recorders with very low quality settings that could result in very distorted or unintelligible recordings. If you are using a digital audio recorder, it is important to have a basic understanding of what contributes to the quality of your audio recording.

Two major settings to be aware of are the sample rate and the bit depth of your recording. The sample rate determines how often a sample is taken from an incoming waveform. The bit depth determines the number of bits used for each one of these samples. Together, these settings and the number of channels determine the bitrate. The bitrate is how many bits are processed per unit of time. Bitrate plays a bigger part in lossy audio files.

Sample Rate

There are a few standard sample rates used in most recorders, often including 44.1kHz, 48kHz, and 96kHz. Audio is usually recorded at 44.1kHz to capture the full range of human hearing. An audio waveform has a positive and negative pressure area; therefore a minimum of two samples must be taken per cycle of a frequency to reproduce it. The range of human hearing is generally given as 20Hz to 20kHz, though it can vary depending on the person. With a sample rate of 44.1kHz, frequencies as high as about 22kHz can be recorded, which more than covers the average person’s hearing range. Higher sample rates such as 96kHz are used to capture roughly twice as many samples and therefore create a higher quality recording, though most would argue that it is almost impossible to hear any quality difference unless using professional audio equipment.
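The two-samples-per-cycle rule can be checked with a few lines of arithmetic. The following is a minimal sketch in Python; the function name is my own and the list of sample rates is only illustrative.

```python
def highest_recordable_frequency_hz(sample_rate_hz):
    """Return the highest frequency a given sample rate can capture.

    At least two samples are needed per cycle, so the limit is half
    the sample rate (often called the Nyquist frequency).
    """
    return sample_rate_hz / 2


# Illustrative sample rates; check which ones your own recorder offers.
for rate_hz in (8_000, 22_050, 44_100, 48_000, 96_000):
    limit_khz = highest_recordable_frequency_hz(rate_hz) / 1000
    print(f"{rate_hz / 1000:g}kHz sample rate captures up to {limit_khz:g}kHz")
```

Comparing the result with the frequencies you need (roughly 20kHz for full-range music, far less for speech) is a quick way to see whether a sample rate is adequate.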

Bit Depth

The bit depth, as mentioned, determines the resolution of each sample that is taken. A 16-bit or 24-bit setting is most commonly used, depending on what medium is being used. Audio CDs, for example, only use 16-bit audio. The bit depth determines the theoretical signal to noise ratio of a recording through a logarithmic formula in which each bit adds roughly 6dB. The signal to noise ratio is the comparison of the desired signal to background and internal noise. A 16-bit recording will have a 96dB signal to noise ratio, while a 24-bit recording will have a 144dB ratio. While 24-bit does have a higher SNR, the 96dB range of a 16-bit recording is often more than enough to create a good quality recording.
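The 96dB and 144dB figures can be reproduced from the usual logarithmic relationship between the number of quantization levels and decibels. This is a minimal sketch with a function name of my own choosing. It gives the theoretical maximum only; a real recorder's microphone and electronics will deliver less.

```python
import math


def theoretical_dynamic_range_db(bit_depth):
    """Theoretical range of a linear PCM recording, in decibels.

    A sample with `bit_depth` bits has 2 ** bit_depth possible levels,
    and 20 * log10(levels) converts that ratio to decibels.
    """
    return 20 * math.log10(2 ** bit_depth)


for bits in (16, 24):
    print(f"{bits}-bit: about {theoretical_dynamic_range_db(bits):.0f}dB")
```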

Bitrate

When using a format such as MP3, bit depth no longer applies because of the lossy compression format. This is when bitrate becomes a more important factor of a recording. The bitrate is the number of bits processed in an amount of time, typically written in kilobits per second. The bitrate of an uncompressed audio file, such as a .WAV file, can be determined from the bit depth, sample rate, and number of channels. A CD with 44.1kHz, 16-bit stereo audio has a bitrate of 1411kbps. MP3 and other lossy audio files typically have much lower bitrates, which is why they are so much smaller than uncompressed formats. They achieve this through perceptual coding, which essentially removes parts of the data that are found to be unnecessary and imperceptible to the human ear. Typical MP3 music files have bitrates between 192kbps and 320kbps in order to maintain good quality. Digital recorders that record lossy formats will often have optional bitrates as low as 32kbps.
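The uncompressed bitrate is simply the three settings multiplied together. A minimal sketch, with a function name of my own, that reproduces the CD figure:

```python
def uncompressed_bitrate_kbps(sample_rate_hz, bit_depth, channels):
    """Bitrate of uncompressed (PCM) audio in kilobits per second."""
    return sample_rate_hz * bit_depth * channels / 1000


# CD audio: 44.1kHz, 16-bit, stereo (two channels).
print(uncompressed_bitrate_kbps(44_100, 16, 2))  # 1411.2
```

The same function shows how much a lower setting saves: a mono, 16-bit recording at 22.05kHz works out to 352.8kbps, a quarter of the CD rate.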

When choosing what settings to use for a recording, it’s important to consider the purpose of the recording. Music production is usually done with at least a 44.1kHz sample rate and a 16-bit depth. WAV and AIFF files are typically the file formats used for the master recording. When later compressed to MP3, as mentioned before, a bitrate between 192kbps and 320kbps is used to maintain the highest quality possible after compression. When a digital recorder is being used for another purpose, such as recording a conversation, other settings may optimize the performance and memory of the unit while still maintaining a high enough quality.

Whenever a smaller sample rate, bit depth or bitrate is used, the recording will always take up less space on the memory of the recorder. This can be very important to someone who may need to leave the recorder on for long periods of time. When capturing audio evidence, a recorder may need to be left on for hours or even days. If this is the case, and a lower quality file needs to be used, it is important to know how to go about maintaining quality while optimizing the memory.
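To get a feel for how much memory a long recording needs, the bitrate can be turned into a file size. This is a rough sketch with names of my own. It ignores file headers and assumes a constant bitrate, so treat the numbers as estimates rather than exact capacities.

```python
def estimated_size_megabytes(bitrate_kbps, hours):
    """Approximate size of a recording in megabytes (1MB = 1,000,000 bytes)."""
    total_bits = bitrate_kbps * 1000 * hours * 3600
    return total_bits / 8 / 1_000_000


# Settings mentioned in this post; the labels are only descriptive.
settings = {
    "CD-quality WAV (1411.2kbps)": 1411.2,
    "High quality MP3 (192kbps)": 192,
    "Low bitrate MP3 (32kbps)": 32,
}

for label, kbps in settings.items():
    per_hour = estimated_size_megabytes(kbps, 1)
    per_day = estimated_size_megabytes(kbps, 24)
    print(f"{label}: about {per_hour:.1f}MB per hour, {per_day:.0f}MB per day")
```

One hour of CD-quality audio comes to roughly 635MB, while one hour at 32kbps is roughly 14MB, which is why low bitrates are tempting for long surveillance recordings.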

Options and Limitations

While the range of human hearing covers up to 20kHz, fundamental frequencies of voice do not fall in the higher end of the frequency range. The human voice is strongest in the 1kHz to 4kHz frequency range. Because of this, it is possible to capture a completely audible and intelligible recording of people talking with a sample rate of only 22kHz. This would mean the highest frequency recorded would be 11kHz, which is still much higher than the most important frequencies in the voice. Some recorders can even be set to an 8kHz sample rate. While this does save a lot of space on the recorder, this means the cut off frequency would be 4kHz. This may be acceptable for some applications but may also cut down on the clarity of the voices. When a large amount of background noise is present, the higher frequencies between 4kHz and 10kHz can add some needed clarity to the voices. It is always a good idea to test the different sample rates before using them to make sure that the quality will be adequate for its purpose.

When trying to optimize the memory on a digital recorder, it is almost always a good idea to use a lossy compression format, such as MP3. This means that the bitrate, rather than the bit depth, will determine the size of the recording. As mentioned before, a bitrate between 192kbps and 320kbps is often very good quality for an MP3. When recording only a voice, where the content of the recording rather than perfect quality is the concern, lowering the bitrate can be very helpful for conserving space. One should be cautious when lowering the bitrate because the data compression may begin to affect the intelligibility of the recording. When too much compression is introduced, digital noise becomes easier to hear, which can sometimes cover up the desired signal. I have heard 32kbps recordings that had so much added digital noise that much of the conversation in the recording had become unintelligible.

In summary, digital audio quality is determined by its sample rate and bit depth or bitrate. There are many options for these settings and not all of them may result in a good quality recording. It is always important to check these settings and be aware of the limitations each setting comes with before beginning a recording. Take into account the content of what you are recording and the quality of audio that is needed. The better you know your digital recorder, the more effective it becomes.

 

Authentication of Digital Audio Recordings

November 11th, 2014

One of our day-to-day activities as audio forensic experts is authenticating digital audio evidence. When one of the parties in a litigation believes that an audio recording was tampered with or edited, an audio forensic expert is brought in to investigate the recording. When we authenticate an audio recording, the first step is to establish chain of custody. While it is the first step, chain of custody does not, in and of itself, establish a recording as being authentic. I have seen audio evidence that was not authentic and was stored in a digital audio recorder. So why is audio authentication so important? What should an audio forensic expert be aware of when examining audio evidence? What is the process of examining and authenticating audio evidence? I am going to answer these questions and more in the following post.

A majority of the audio recordings we are hired to authenticate are created on digital audio recorders or smartphones using a recording app. These devices are easily concealed in a pocket or purse. They come in many shapes and sizes. They record various formats. One of the first steps an audio forensic expert must take when authenticating a digital audio recording is to become familiar with the equipment that created the recording.

Importance of Authentication

The authentication process determines whether or not the audio recording in question has been tampered with. In this age of digital audio, edits can be made and covered up very easily. Free audio editing software, such as Audacity, is available online and can make edits that alter the events or conversation that originally occurred in digital audio recordings.

Of all the audio authentication cases I was assigned in the last 30 days, I found that two had been edited. Both of the recordings were downloaded to a computer, edited, then played back and re-recorded through desktop computer speakers using a digital audio recorder. Most of the time, if an audio recording is edited after downloading to a computer and before authoring a CD, the editing can be detected in the digital recording’s metadata. During the forensic authentication process, the software that created the edits will be detected in the HEX information of that edited recording.

If audio evidence is found to be altered, it should be ruled inadmissible in court because it is not an accurate representation of the events that occurred.

So what should the audio forensic expert be aware of during the authentication process?

First, establish and determine the chain of custody. If the expert is able to retrieve the evidence from the original source, in most cases that will automatically establish a chain of custody, or provide clues of tampering if the recording was edited and re-recorded. If it’s not possible for the forensic expert to retrieve the recording, then the forensic expert must carefully go through all of the documents and reports that arrived with the evidence. Sometimes a chain of custody log from law enforcement will be included, which will strengthen the authenticity of the audio evidence. But if the chain of custody cannot be established, the forensic examiner must rely on other techniques as well as their own expertise to determine the authenticity of the evidence. If further investigation reveals more inconsistencies in the recording and metadata, more often than not that recording is determined to be altered.

Digital audio recorders aren’t the only equipment that records audio evidence. CCTV surveillance systems, as well as most other digital video recorders, will include both audio and video in the recordings. As an Audio and Video Forensic Expert, I often work with both the video and audio from these recordings. When I receive digital media evidence that includes sight and sound, I analyze both audio and video using separate forensic processes. I have come across cases in which the video was unedited but the audio had been tampered with. In one such case, I was unable to authenticate the evidence because a chain of custody could not be established. Plus, there were anomalies in the audio that could be measured, heard and documented.

Process of Examining and Authenticating Audio Evidence: Critical Listening

One of the first steps I take when audio evidence arrives at our lab is to listen critically to the entire recording a number of times. During this process I note unusual sounding sections in the recording, which are called anomalies. I take notes and place markers using the forensic software so that I can find them later and include them in my forensic report.

These unusual sounding sections can be changes in the background ambience, inconsistent speech pacing and wording, or changes in the noise floor. The noise floor is a series of natural and electronic sounds that should be consistent throughout the recording. Noise is defined as any sound, such as hiss, hum, wind or HVAC, that is not part of the intended recording.

Critical listening must be the first step in becoming familiar with the audio evidence. Edits discovered during the critical listening phase usually take the form of abrupt changes. Detecting these changes is not easy and comes with experience.

It’s important for the forensic expert to put themselves in a quiet, isolated room during critical listening so as to avoid any outside disturbances. The quiet environment enhances the critical listening focus. High quality, professional grade monitoring headphones and high quality studio monitors (speakers) are best for critical listening analysis of digital audio recordings. Professional quality headphones and speakers will have the flattest frequency response, which means they produce neutral and natural sound. This is very important for the forensic expert because subtle boosts and cuts in frequencies can impact the analysis of the digital audio recording.

Sometimes frequencies may be more audible in headphones and sound clearer to the forensic expert while other frequencies may be better heard through speakers. When the forensic expert is examining audio evidence for authentication, it is important to use both headphones and speakers to hear every aspect of the recording.

In some audio evidence I have examined, I have been able to hear a second noise floor in the recording. This usually occurs when a recording is played through speakers or an auxiliary cable into another recorder. The original noise floor from the recording is heard along with the second noise floor created from the second recording.

Electronic Measurement

After critical listening, the forensic expert must use electronic measurement to examine the audio evidence. This is done by noting the prominent frequencies in the voices or other sound source and the noise floor. The levels of the recording and of the different frequencies can be measured as well. Tools such as spectrograms, frequency analysis windows and level meters are very helpful for observing and collecting this information. The expert should note the frequency range of the overall recording, the voices or conversation and the noise floor or extraneous sounds in the recording.

If the frequency range of a voice suddenly becomes larger or smaller or shifts in frequency range, that can be a sign of an edit. Sudden, unexplained changes in the noise floor level as well as the sudden presence of another background noise can also be a sign of an edit. As I mentioned before, I have come across recordings in which I could hear two noise floors. This can often be measured and seen in a spectrogram and a frequency analysis panel.
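As a rough illustration of measuring level changes, the sketch below computes the level of a recording in short windows and flags places where it jumps suddenly. It is an assumption-laden toy and not a description of any forensic tool: the function names, the window size and the 6dB threshold are all my own choices, and the samples are assumed to be floating point values between -1.0 and 1.0. A flagged window is only a place to look and listen more closely, not proof of an edit.

```python
import math


def window_levels_db(samples, window_size):
    """Return the RMS level, in dB relative to full scale, of each window."""
    levels = []
    for start in range(0, len(samples) - window_size + 1, window_size):
        window = samples[start:start + window_size]
        rms = math.sqrt(sum(s * s for s in window) / window_size)
        # Guard against log10(0) on pure digital silence.
        levels.append(20 * math.log10(max(rms, 1e-12)))
    return levels


def flag_level_jumps(levels_db, threshold_db=6.0):
    """Return the indices of windows whose level differs sharply from the previous one."""
    return [
        i
        for i in range(1, len(levels_db))
        if abs(levels_db[i] - levels_db[i - 1]) > threshold_db
    ]


# Synthetic stand-in for a noise floor that suddenly gets 20dB louder.
quiet = [0.01 * (-1) ** n for n in range(400)]
loud = [0.1 * (-1) ** n for n in range(400)]
levels = window_levels_db(quiet + loud, window_size=100)
print(flag_level_jumps(levels))  # [4]
```

In practice the same idea is applied to sections that contain only the noise floor, and the result is confirmed in a spectrogram and by ear.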

Visual Inspection

Visually inspecting the audio waveform and spectrogram is the next step in authenticating the audio. This goes hand in hand with the electronic measurement as the forensic expert analyzes the physical wave properties and frequency information. Waveforms are continuous and smooth when examined very closely. Even a quick, loud sound like a clap will have a smooth, continuous wave. If there are sudden breaks in the waveform of a recording, these are signs of editing. The expert should also pay close attention to the phasing of the waveform. This can also be seen when visually zooming in to the waveform. If the waveform of the recording is suddenly inverted, this can also mean an edit was made.
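The idea of a "sudden break" can be sketched in code: in a smooth waveform, neighboring samples differ by small amounts, so a sample-to-sample jump far larger than the typical one stands out. The following toy example is my own illustration. The function name and the factor of 5 are assumptions, and a real recording would need the flagged points inspected visually and by ear.

```python
import math
import statistics


def find_discontinuities(samples, factor=5.0):
    """Return indices where the jump from the previous sample is unusually large.

    "Unusually large" here means more than `factor` times the median
    sample-to-sample change, which is an arbitrary illustrative rule.
    """
    diffs = [abs(b - a) for a, b in zip(samples, samples[1:])]
    if not diffs:
        return []
    typical = statistics.median(diffs)
    if typical == 0:
        return []
    return [i + 1 for i, d in enumerate(diffs) if d > factor * typical]


# A smooth 100Hz tone sampled at 8kHz, used as a stand-in for a recording.
tone = [math.sin(2 * math.pi * 100 * n / 8000) for n in range(2000)]

# Simulate a crude edit by cutting 20 samples out of the middle.
spliced = tone[:1000] + tone[1020:]

print(find_discontinuities(tone))     # []
print(find_discontinuities(spliced))  # [1000]
```

A careful editor can place cuts where the waveform crosses zero, so the absence of a flagged jump does not show that a recording is unedited.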

The spectrogram will show the full frequency spectrum with warmer or colder colors representing the strength of each frequency. The noise floor can be seen very clearly in this view, helping to identify breaks in the sound. All recordings have some noise floor, even if it is almost inaudible. When viewing the spectrogram, any breaks in the noise floor may be signs of an edit. Changes in the volume of the noise floor can also be a sign of an edit.

Analyzing Metadata in Digital Audio Evidence

When I first began working as an Audio Forensic Expert, most of my work was with analog audio evidence in the form of mini, micro and standard audio cassettes. I did have some cases where reel to reel tape was used. Today almost all recordings are made digitally, so there is additional information that can be analyzed when performing an audio authentication. Digital audio recordings contain metadata which reveals information about how the recording was made and the type of equipment that created the recording. If a recording was loaded into a software program capable of performing edits, there will often be a footprint left in the recording HEX information showing what software was used.
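A software footprint of this kind is often just readable text sitting among the raw bytes of the file. The sketch below searches a file for a few such strings. The list of strings is a hypothetical, illustrative one that I made up for the example, not a validated forensic reference, and the function name is my own. Always work on a verified copy of the evidence, never the original.

```python
from pathlib import Path

# Hypothetical examples of text an editing or encoding program might
# leave behind. A real examination would use a much fuller, verified list.
EXAMPLE_SIGNATURES = [b"Audacity", b"Adobe Audition", b"Sound Forge", b"Lavf"]


def find_software_footprints(path, signatures=EXAMPLE_SIGNATURES):
    """Return {signature: byte offset} for each signature found in the file."""
    data = Path(path).read_bytes()
    found = {}
    for signature in signatures:
        offset = data.find(signature)
        if offset != -1:
            found[signature.decode("ascii")] = offset
    return found


# Usage (the file name is a placeholder):
# print(find_software_footprints("copy_of_evidence.wav"))
```

An empty result does not mean the file is original, since some programs leave no readable text and metadata can be stripped. A hit simply tells the examiner where to look in a HEX viewer.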

When examining the digital information, it is necessary to create an exemplar recording to compare the metadata with the original. An exemplar is a recording that is made in conditions that are as close to the original recording as possible. The exemplar is made on the same kind of audio recorder and, if possible, in the same environment. Using this exemplar, the forensic expert can compare the metadata and HEX information of the two files. If there are inconsistencies in the data, that can also be a sign of tampering.
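One simple part of that comparison, the basic format of two WAV files, can be sketched with Python's standard `wave` module. This covers only uncompressed WAV files and only the header fields the module exposes; the function names and file names are my own placeholders, and a full examination would compare far more than this.

```python
import wave


def wav_format(path):
    """Read the basic format fields from a WAV file header."""
    with wave.open(path, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": wav.getframerate(),
            "frame_count": wav.getnframes(),
        }


def format_differences(evidence_path, exemplar_path, ignore=("frame_count",)):
    """Return {field: (evidence value, exemplar value)} for fields that differ.

    The frame count is ignored by default because the exemplar is not
    expected to be the same length as the evidence recording.
    """
    evidence = wav_format(evidence_path)
    exemplar = wav_format(exemplar_path)
    return {
        field: (evidence[field], exemplar[field])
        for field in evidence
        if field not in ignore and evidence[field] != exemplar[field]
    }


# Usage (file names are placeholders):
# print(format_differences("copy_of_evidence.wav", "exemplar.wav"))
```

If the evidence file reports a sample rate or bit depth that the recorder cannot produce in any of its modes, that inconsistency is worth documenting and investigating further.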

For a forensic expert to authenticate a piece of audio evidence, the expert must prove beyond any doubt that the recording is in its original form and has not undergone any tampering. If a piece of evidence is not authentic, it should not be used in court because it may be incomplete or altered to purport events that did not occur.

Hopefully this post helped inform you about the authentication of digital audio recordings. If you have any questions, email us at primeauforensics@gmail.com, or give us a call at 800-647-4291.

New Audio Recording on Michael Brown Shooting: Real or Fake?

August 26th, 2014
I was contacted today by two media outlets regarding an audio recording that was released last night to CNN. The recording is alleged to be a video chat with the gunshots that killed Michael Brown in the background.

I began my forensic investigation by researching the weapon Officer Wilson used to kill Michael Brown. The weapon is a Sig Sauer P229. Next I found a video on YouTube of the Sig Sauer P229 being fired. After conducting a forensic comparison, I concluded that the weapon in the recording matches the Sig Sauer P229. However, there are other much bigger issues.

Why did this person wait so long to release this recording? You would think he would go to authorities immediately! Next, why did we receive only a portion of an obviously longer recording? Then there is the fact that it sounds like he is reading instead of conversing with his girlfriend, as he said he was. Plus, don’t you think he would stop when he hears gunshots? I know I would. Why does he continue to read right through the shots fired?

Seems like a fake recording to me. Something to get attention the day after Michael Brown was laid to rest. Don’t you think the timing is odd for this recording to be released?

Listen for yourself and let me know what you think. Let’s use #MichaelBrownAudio for discussing on Twitter.

 


The Structured Approach to Objective Audio Enhancement

July 9th, 2014

The audio enhancement processes that I have learned are some of the accomplishments of which I am most proud as a forensic expert. Audio enhancement is both an art and a science; and as an audio forensic expert with 30 years of experience, I can tell you with confidence that no two assignments are the same. This knowledge has helped me develop a structured approach to objective audio enhancement.

In the following post I would like to help you better understand proper audio enhancement techniques through an objective and structured approach. On average, I enhance between 200 and 300 audio recordings per year. For each assignment, I use the knowledge and skills I have gained from past experiences to effectively enhance the recording. I believe I have developed a strong understanding and talent for audio enhancement.

When I first receive an audio recording from a client, I begin my enhancement process by listening through the recording several times. Critical listening is key for identifying different sections of the recording. When I refer to sections, I mean portions of the audio that have different characteristics such as levels, frequency ranges, or signal to noise ratios. For example, the first section may have two people talking quietly with a lot of street and car noise in the background. The next section may have a more audible conversation with a train passing far off in the distance. The third section may have no background noise at all but the lower frequencies of the people talking are suddenly louder. Each section of the audio recording has different characteristics and will need different processes to correctly enhance them.

Most audio editing software allows you to add a marker to the timeline at your cursor’s current location. During playback, using a hotkey specific to the software, I can add markers while listening through the recording in order to identify the in and out points of each section. This ensures that I do not apply a processor that may hinder other portions of the audio. Once the sections have been established, I can apply different plugins to each section as needed.

Understanding the different tools used in both analogue and digital audio editing laid a strong foundation for my career as an audio forensic expert. For example, what audio enhancement tool should I begin with? In what order should I apply the processors to get the best results? Should I start with noise reduction or equalization? Is compression or normalization more applicable to this audio recording? These are important questions to consider when beginning the enhancement process. The plugins I use are based on the critical issues I hear in each section. The order of the processors can be key in producing a clean and balanced product.

Typically noise reduction will be the first step in the structured approach. This prevents the noise from becoming an issue in further processing. Compression will usually be applied next to raise and balance the level of the section or overall recording.  Equalization can now be applied to the less noisy, balanced signal. Gates and further compression can also help remove unwanted sound or boost desired sound. While this is a good structure to follow, it may not be right for every situation. If there is an exceptional amount of background noise, a gate can be helpful before most of the other processors, especially compression. Occasionally equalization is also better as the first executed process. By drastically cutting a small range of frequencies, unwanted overtones in the human voice can be removed from further processing. Each recording can require any number of processors to reach the desired results; in some cases I may add as many as ten different plugins before I am satisfied with the results.  
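To show why the order of processors matters, here is a toy sketch in which each marked section of a recording gets its own ordered chain. The two processors are deliberately simple stand-ins that I wrote for illustration (a hard gate and a peak normalizer), not the plugins used in real enhancement work, and all names and threshold values are assumptions.

```python
from functools import partial


def noise_gate(samples, threshold):
    """Silence any sample quieter than the threshold (a very crude gate)."""
    return [s if abs(s) >= threshold else 0.0 for s in samples]


def normalize(samples, peak=1.0):
    """Scale the samples so the loudest one reaches `peak`."""
    loudest = max((abs(s) for s in samples), default=0.0)
    if loudest == 0.0:
        return list(samples)
    gain = peak / loudest
    return [s * gain for s in samples]


def apply_chain(samples, chain):
    """Run the samples through each processor in the order given."""
    processed = list(samples)
    for processor in chain:
        processed = processor(processed)
    return processed


def enhance_sections(samples, sections):
    """Apply a separate chain to each (start, end, chain) section."""
    result = list(samples)
    for start, end, chain in sections:
        result[start:end] = apply_chain(result[start:end], chain)
    return result


gate = partial(noise_gate, threshold=0.03)
recording = [0.01, 0.5, -0.02, -0.25]

# Gate first: the quiet noise is removed before the level is raised.
print(apply_chain(recording, [gate, normalize]))  # [0.0, 1.0, 0.0, -0.5]

# Level first: some noise is boosted above the threshold and survives.
print(apply_chain(recording, [normalize, gate]))  # [0.0, 1.0, -0.04, -0.5]
```

The same input gives two different results depending on the order, which is the point of working out a structure before applying processors. `enhance_sections` shows how the sections marked during critical listening can each receive a different chain.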

Many of our clients at Primeau Forensics will say that they attempted to enhance their audio recording on their own and were unsuccessful. I explain that the audio enhancement process requires experience as well as a structured, scientific approach in order to produce effective results. Audio editing software is only a tool used in the enhancement process and owning a program does not give you the experience and skills necessary to enhance audio recordings like a professional.  

The structured approach to objective audio enhancement comes from experience. It is based on years of ‘hands on’ work with audio enhancement as well as observing sound recordings and the critical issues that interfere with the desired sounds. Please contact Primeau Forensics for your free consultation.  

1-800-647-4281

primeauforensics@gmail.com

An Accurate and Affordable Approach to Audio Enhancement

June 10th, 2014

The audio enhancement process is the number one forensic activity at Primeau Forensics. Audio enhancement, or sound enhancement, questions and assignments come into our offices daily from around the world. Audio enhancement helps people better understand words that were recorded but not clearly heard.

Last November I was asked by Jeff Morley to combine two versions of the Air Force One recordings from the day John F Kennedy was assassinated. Once my team and I had the recordings combined, the next step was to work on the enhancement process. 

As an audio forensic expert, audio enhancement is one of my favorite forensic activities. This is likely because when I started my career as an audio engineer, one of my first assignments was with the FBI. The experience was extremely rewarding because the two Detroit agents that came in to our recording studio, Ambience Recordings, were very appreciative and complimentary. I took an audio recording and used tools to reduce the unwanted background noise and enhanced the speaking portion of the recording. 

Audio enhancement is both an art and a science. It is an art because, as forensic experts, we have tools like noise reduction, equalizers and compressors to create with, much as an artist has paint, brushes and a canvas. We use these tools to artistically repair sound, taking it from poor to enhanced and clear, so that the speaking portion of the recording can be better understood.

Audio enhancement is a science because the tools have to be scientifically calculated and applied in specific orders depending on the experimentation with the order of application and the results from each application. I find myself using ‘control Z’ quite often during sound enhancement processes. 

Clients from around the world, including police departments and private individuals, use digital pocket recorders to document and preserve a confession or other event in order to refer back to that event at a later date. The problem is that some of the time their recording does not go as planned. Background noise interferes more than planned because recorders pick up unwanted sound. Digital audio recorders do not record in the same manner that our ears perceive sound. When the digital pocket recorder is taken back to have the recording downloaded to a computer, the unwanted background sound is much more obvious than when the recording was created.

This is where our services as audio forensic experts are sought out. After 30 years, we have become quite good and pretty quick at enhancing audio. Our speed and accuracy save our clients money because many forensic experts take long periods of time applying various tools by trial and error. We, on the other hand, have the ability to recognize a noise situation and determine the order of processing necessary for audio restoration in a short period of time.

In fact, we have started a service that accommodates our clients financially. Clients often have much higher than normal audio enhancement expectations. They hope the impossible can be made possible. Even the best forensic experts at Primeau Forensics cannot restore all sound to our clients’ expectations.

This is why we have implemented a preliminary investigation process. This process allows us to send a sample of the restored recording to our clients to show them what is and is not possible. That way we can learn for a lesser rate if we can meet their expectations for audio enhancement. I am proud to say that in many cases we meet and even exceed their expectations. 

Techniques for Testifying Telephonically

May 9th, 2014

I recently testified telephonically for the United States Army. During my testimony I realized that there are several ingredients to a testimony by telephone. The following blog post will outline what I consider very important tips when testifying via telephone.

First, like all testimony, do your homework. Read all documents supplied repeatedly and nearly memorize them. Like testifying live, the lawyers, prosecutors and judges want to know what you are looking at while testifying. That is not to say looking at notes is bad, but instead to stress the importance of connecting your communication with the notes that are involved in the case. Remember, when on the phone, you are using only 20% of your communication tools. Body language and facial expressions are not being used during telephone testimony.

This segues nicely into the next tip: use voice tone and inflection to your advantage. For example, when we smile it lights up our face as well as our voice (if we are speaking). When testifying telephonically keep in mind the smile factor. Pay close attention to your voice tone and inflection when you are speaking since the majority of your other communication tools are not present. When making a strong point, use pausing and voice inflection to help the court feel your message since they cannot see you delivering your message.

Third, like when providing a testimony in person, it is crucial to rehearse with your lawyer and client. By rehearsing I mean practice a direct line of questioning as well as anticipate cross examination questions. Some of the lawyers I have worked with in the past insist on skipping this very important step. At Primeau Forensics, we take every step to make sure we are available for both our clients and lawyers during our time working together. Before the testimony we take extra care to organize, practice and rehearse; that way when it comes time to testify, there will be little to no mistakes, stutters or hesitations.

Remember, when testifying telephonically, the first rule is to always tell the truth. Keep in mind the words you are saying and the way you are saying them. Remember, the judge and jury are listening and evaluating you by your words because they cannot see your face or body language.
