|
Post by The Tracker on Jan 13, 2006 19:29:41 GMT -5
FORENSIC VOICEPRINTS
By Katherine Ramsland
Chapter 1 The Origin of Voiceprints
Voice analysis for the KGB? That's what political prisoners with special skills are forced to do in Alexander Solzhenitsyn's fact-based novel, The First Circle. Although imprisoned, these scientists have a unique position in Stalin's Russia. They live in a penal institution that doubles as a scientific research center and their assignment is to develop voiceprint technology. While the Russian secret police analyze phone calls in Germany, the technicians are pressed to figure out how to scientifically measure the individuality of the human voice. The novel offers a fascinating glimpse into the early days of this technology, but it was not in 1949 Russia where it all began. The idea that someone could be identified by the sound of his voice had its origins in the work of Alexander Melville Bell (father to Alexander Graham Bell). Over one hundred years ago, he developed a visual representation of what the spoken word would look like. It was based on pronunciation and he showed that there were subtle differences among different people who said the same things. His son later joined him in using the system to help deaf people to speak. Then in 1941, the laboratories of Bell Telephone in New Jersey produced a machine, the sound spectrograph, for mapping a voice onto a graph. It analyzed sound waves and produced a visual record of voice patterns that were based on frequency, intensity, and time. Acoustic scientists used it during World War II, as seen in Solzhenitsyn's novel, to attempt to identify enemy voices on telephones and radios. However, with the war's end, the urgency for this technology diminished and little came of it until later. Voiceprint technology began to attract notice for criminal investigations in the early 1960s when the New York City Police Department received numerous bomb threats by phone against major airlines. Stymied, the FBI asked Bell Labs to help. Lawrence G.
Kersta, one of their senior engineers, was assigned the task of figuring out a method of identification that would stop the calls and bring the perpetrators to justice. He was a physicist who had worked with the sound spectrograph in its early days. It took him more than two years and the analysis of over 50,000 voices, but he managed to offer a technique that he claimed tested at 99.65% accuracy. He had even brought in professional mimics to try to fool the machine, but try as they might to imitate someone else's voice, the mimics showed up in the graph as quite dissimilar from the original voices. Kersta eventually broke away from Bell Laboratories to market the machine on his own. Then in 1966, the Michigan State Police started to work on a practical application of voiceprint technology in criminal investigations. They formed a Voice Identification Unit and hired Kersta to train these officers. Their intent was to use it to assist with ongoing cases, but it wasn't long before its legal weight was reviewed in a courtroom. Voiceprint technology came into the American courts in the 1960s, and judges were divided on whether or not to admit it as scientific evidence. There was little research to support it, there were few people who really could be called technical experts, and linguists testified against one another on its viability. The first case was in military court, United States v. Wright, and that began the judicial controversy. One court ruled the technology admissible, but a dissenting judge wrote a detailed opinion on why it should not be considered scientifically acceptable. The New Jersey Supreme Court was the first non-military court to make an appellate review, in State v. Cary. Courts in New York and California had admitted this type of testimony, so the New Jersey justices remanded the case to check the accuracy of the equipment. Another appeal came their way and they ruled that it was too early to tell whether this method was reliable. 
After several more times back and forth, with no new scientific support, the voiceprint identification evidence was excluded. The reason for this, and the subsequent case history, are supplied in detail in Chapter 5. First, let's look at how the sound spectrograph worked in a murder investigation.
|
|
|
Post by The Tracker on Jan 13, 2006 19:31:00 GMT -5
Chapter 2 The Voiceprint of a Killer
The year was 1971. Neil LaFeve, an amiable but law-abiding game warden in Wisconsin, was found murdered on September 24th, on his 32nd birthday. That afternoon, he had been out in the woods posting signs and had planned to finish long before the party that his wife had organized for him. When he failed to show up, his wife grew worried and phoned his boss. They discussed it together, but there was no reason they could think of that Neil might still be out in the woods. LaFeve's boss drove out to have a look. He noticed that all the signs had been posted, so when darkness came and there was still no indication that LaFeve was returning, he called the police. They searched through the night, but gave up without finding the missing warden. In the morning, the search party came across LaFeve's truck. It was empty and the door was ajar. Things looked bad and only got worse when they found a large amount of blood not far away. Another searcher picked up some broken sunglasses and two spent shells from a .22 rifle. From there, more signs of a wounded man formed a trail: human body matter, a tooth, blood, and bone fragments. They felt certain they would not find him alive. Finally the search party reached a spot that looked like it had been recently dug up. The police got shovels and soon they had located Neil LaFeve - without his head. Another freshly dug spot nearby, though much smaller, yielded his head. It had been hacked off with a blunt instrument (a shovel or spade) and two bullets were embedded in the skull. The coroner also found several bullets in the corpse. The first step was to determine if LaFeve had any enemies. The officers in charge of the investigation looked through a list of men that LaFeve had arrested for poaching, because these men could have a vendetta. The brutality of the attack indicated rage or revenge, not just a random killing.
All of the men who had been convicted of hunting illegally on those grounds were located and interviewed on tape, and a few were asked to submit to polygraph exams. However, there was one man who refused to cooperate: 21-year-old Brian Hussong. LaFeve had arrested him several times, yet he had continued to poach. Hussong had no alibi for September 24th and he resisted all attempts to clear up the murder mystery. He seemed a likely suspect. Sergeant Marvin Gerlikovski was in charge, so he got a rare court order that allowed him to put a wiretap on Hussong's house. He took the extra precaution of recording everything that was said, which paid off in a way he didn't expect. It wasn't long before Hussong got on the phone to get his grandmother to hide his guns and give him an alibi. She appeared to cooperate, so Gerlikovski sent detectives to her house. Flustered, she led them straight to the hiding place. Ballistics experts confirmed a match between the .22 rifle and the bullets found in LaFeve's body, which was enough evidence to place Hussong under arrest. Gerlikovski then sent the tapes he had made to Michigan's Voice Identification Unit—at that time the best in the world for this type of procedure. The leading experts in voiceprint analysis had trained these officers. Ernest Nash examined the tapes, gave his opinion, and ended up serving as an expert witness during Hussong's trial. However, it was not Hussong's voice that he testified about, but that of Hussong's grandmother. She had denied saying that she had hidden the guns, so Nash explained how he could match her voice to that of the voice on the tape. He then used his laboratory results to affirm that she was definitely the person speaking to her grandson on the tape. The jury listened to the tapes again, and after less than four hours of deliberation, they returned a guilty verdict of first-degree murder that gave Hussong a life term in prison. 
So just what is it about the human voice that makes it electronically measurable?
|
|
|
Post by The Tracker on Jan 13, 2006 19:32:15 GMT -5
Chapter 3 The Spectrograph and the Human Voice
Anyone who talks on a phone or tape recorder is fair game for voice analysis, especially if they have criminal intent. Increasingly, more law enforcement officers are getting trained in voiceprint analysis, and with the development of computer and digital spectrogram technology, the procedure is becoming widely used. Lawrence Kersta noted that each person's voice has a unique quality that can be mapped on a graph. The individuality derives primarily from differences in physical vocal mechanisms. One person's vocal cords, no matter how similar they might look, process sounds differently than someone else's. The size and shape of someone's vocal cavity, tongue, and nasal cavities contribute to this, as well as how that person coordinates lips, jaw, tongue, and soft palate to make speech. No combination of these things is like any other. That means that our voices are sufficiently unique to make personal identification based on voice sounds possible. Although Kersta also believed that an individual's voice does not change over his or her lifetime, other experts have disputed him on this point. If the body changes, so does the voice. Even where a person lives can effect voice changes, as well as illness, stress, aging, and other factors. Nevertheless, Kersta maintained that the essential qualities of the voice remain constant. He felt that he finally proved this in one of the most famous cases involving the spectrograph: that of the reclusive Howard Hughes. In 1971, a man named Clifford Irving came to New York to cut a deal for what he claimed was Hughes' autobiography, ghosted by him. He had letters that he insisted were written by Hughes and experts soon authenticated them. The publisher McGraw-Hill bought into his claim, advancing him $765,000 and announcing their intent to publish the book. Eventually Irving turned in a 1,200-page manuscript.
It was difficult to ascertain whether Hughes had actually authorized this transaction since for the past fifteen years he had been exceedingly elusive. That Irving had letters from him seemed a good indicator that they knew each other. Several people who had known Hughes read the manuscript and felt convinced that it was genuinely his story. However, he finally surfaced from his retreat on Paradise Island in the Bahamas to renounce the book. Hughes claimed that he had never met Clifford Irving and that the whole thing was a fake. He added that he did not know where Irving had gotten his information. However, he was not willing to make his renunciation in person. He agreed only to do this by phone. That meant that he could be identified only by his voice—how it sounded and what he said. A group of reporters familiar with him from his early days was assembled by NBC in Los Angeles to ask him questions for two hours. Their purpose was to authenticate the voice on the phone as that of the famous, eccentric billionaire, and they were to ask some key questions that would trip up an imposter. The man on the phone responded in convincing detail. He talked about such things as the make of his plane and trips that he had made, but he stumbled when asked about the good luck charm that a woman had presented to him before his 1938 trip around the world. He said that he could not recall the incident, but moments later he did: She had placed chewing gum on the tail of his plane. This entire phone conversation was recorded and as they listened again, the reporters all believed that Hughes had been the man on the phone. That meant that Irving was a fraud. Irving defended himself by insisting that the person who had called was the imposter, but NBC had hired Lawrence Kersta to make a voiceprint analysis. 
He measured pitch, tone, and volume to compare the voice pictures on a line-by-line basis, comparing a recording of a speech that Hughes had made in 1947 with the recordings from the interview. Finally he announced that the man who had spoken to reporters was Howard Hughes. Even one of Kersta's most vocal critics, phonetics professor Peter Ladefoged, admitted that the recordings were remarkably identical. Irving was arrested and convicted of forgery. He repaid the publisher and was sentenced to thirty months in prison. Since the recordings had been made nearly a quarter of a century apart and Hughes' voice had deepened, there had been concern that changes would make the reading impossible. However, the spectrographic patterns proved to be impressively similar. This result further convinced Kersta that the inherent uniqueness of an individual's voice remains constant. Spectrographic analysis of the human voice has made a similar impact in other criminal cases, so let's see more specifically how an interpretation is made.
|
|
|
Post by The Tracker on Jan 13, 2006 19:34:05 GMT -5
Chapter 4 How It Works
Many law enforcement laboratories are equipped with at least one sound spectrograph, although there are several types to choose from. This machine plots the frequency of a complex sound according to time and intensity. Its function is based on the idea that the human voice is produced by a combination of physiological structures and harmonics. The vocal column begins in the vocal folds and ends at the lips. The vocal folds function acoustically as a closed end so that the vocal column becomes a closed-tube resonator. The tension of the vocal folds determines the vibrational frequency. When a sound is produced, those harmonics nearest the resonant frequency of the vocal column increase in amplitude. If the shape of the mouth, throat, or lips changes, the frequencies vary with the change. The sound spectrograph converts the sound of a voice into a visual graphic display known as a voiceprint. The analog spectrograph has four parts: a magnetic tape recorder unit, a tape scanning device, a filter, and an electronic stylus that writes the information onto electrically sensitive paper. A high-quality tape is fastened to the scanning drum, which holds a 2.5-second segment of tape. The process takes about eighty to ninety seconds to complete. As the drum revolves, an electronic filter is activated that allows only a certain band of frequencies to get through to the recorder. These frequencies are translated into electrical energy that gets recorded by the stylus. As the process continues, the filter moves into increasingly higher frequencies and the stylus records the intensity levels of each defined range. The final print shows a pattern of closely spaced lines that represent 2.5 seconds' worth of all of the distinguishable frequencies of that person's voice as it was taped. The horizontal axis on a voiceprint represents time. The vertical axis is the frequency, registering how high or low a voice is.
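For readers who think in code, the band-by-band scan described above can be approximated digitally. The following is only a rough sketch, not the analog machine's actual circuitry and not any forensic lab's software: it slices a recording into short overlapping pieces and measures the strength of each frequency band in each piece, which gives the same time-by-frequency-by-intensity picture. The function names, the 256-sample slice, and the synthetic test tone are all assumptions made for illustration.

```python
import math


def frame_spectrum(frame, sample_rate):
    """Return (frequency_hz, magnitude) pairs for one short slice of audio."""
    n = len(frame)
    # A Hann window tapers the edges of the slice to limit spectral leakage.
    windowed = [
        sample * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
        for i, sample in enumerate(frame)
    ]
    spectrum = []
    for k in range(n // 2 + 1):  # one band per DFT bin, up to half the sample rate
        angle = 2 * math.pi * k / n
        real = sum(s * math.cos(angle * i) for i, s in enumerate(windowed))
        imag = -sum(s * math.sin(angle * i) for i, s in enumerate(windowed))
        spectrum.append((k * sample_rate / n, math.hypot(real, imag)))
    return spectrum


def spectrogram(signal, sample_rate, window_size=256, hop=128):
    """Return (start_time_seconds, spectrum) pairs, one per overlapping slice."""
    rows = []
    for start in range(0, len(signal) - window_size + 1, hop):
        frame = signal[start:start + window_size]
        rows.append((start / sample_rate, frame_spectrum(frame, sample_rate)))
    return rows


def loudest_frequency(spectrum):
    """Frequency of the band with the greatest magnitude in one slice."""
    return max(spectrum, key=lambda pair: pair[1])[0]


def make_tone(frequency_hz, sample_rate, seconds):
    """Synthetic stand-in for a recording: a pure sine tone (hypothetical input)."""
    count = int(sample_rate * seconds)
    return [math.sin(2 * math.pi * frequency_hz * i / sample_rate) for i in range(count)]
```

Calling `spectrogram(make_tone(1000, 8000, 0.25), 8000)` yields one spectrum per time slice; in each slice the strongest band sits at the tone's frequency. A real voice would show several strong bands that shift as the mouth and throat change shape. The direct transform used here is slow and chosen only for readability; practical tools use a fast Fourier transform.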
The degree of darkness within each region on the graph illustrates the degree of intensity, or the voice's volume. Two kinds of prints can be made: bar prints, which are utilized for identification, and contour prints, which help to file the prints in a computer. Recent developments include digital spectrographs that can be used with a computer for enhanced comparison and measurement, but some specialists still prefer the older analog model. Comparisons are made between voice samples and when sufficient similarity exists between one pattern and another, the voices are believed to have a high probability of originating from the same person. For forensic purposes, the voiceprint interpreter needs a recording of the suspect's voice (e.g., from an interview) to compare to the sample made in the context of a crime, such as an obscene phone call or taped conversation. Other people's voices, unrelated to the crime, are used for elimination factors (points of dissimilarity). Interpreters use two methods of identification:
Aural: listening to the voice on tape to compare single sounds and series of sounds for similarities and discrepancies; the examiner also listens for breath patterns, inflections, unusual speech habits, and accents.
Visual: reading the voiceprints to compare their appearances.
First, the examiner evaluates the recording of the unknown suspect, to make sure it has sufficient quality and clarity for analysis. Then the examiner turns to the voice of the known person to ensure that the recording has similar clarity. The best test cases have the suspect repeat what was said on the "unknown voice" tape, or at least include as many of the same words as possible. The aural and visual methods are combined to come up with one of five conclusions: positive identification, probable identification, positive elimination, probable elimination, or no decision. The highest standard requires the identification of twenty speech sounds that possess similarities.
"Positive elimination" derives from twenty or more differences, and the rest fall on a spectrum in between. Some critics of this technology claim that it has never been adequately developed to prove that voiceprints are as individual as fingerprints. However, those who work closely with it on a regular basis insist that the spectrograph is highly accurate. Tom Owen, who runs Owl Investigations, Inc., has thirty-five years of experience in the recording arts and is a certified Voice Identification Examiner. He teaches at the New York Institute of Forensic Audio and offers specialized courses on audio and video analysis. He also consults with law enforcement agencies around the world on specific cases, and for more than twenty years has served as an expert witness in both criminal and civil proceedings. His agency has a fully equipped processing laboratory, which includes five different types of spectrograph machines for voice identification and speech enhancement. With Michael McDermott, Owen has written an extensive article on the history, methods, and forensic applications of voice technology, and he takes on fifty to sixty cases each year. "It's not uncommon," he says, "that at a murder scene or shooting, you have a tape made from a 911 call where the victim might have been calling for help, or else the person might have been on the phone talking to a relative. Someone shoots the person, the victim dies and the shooter doesn't realize that the machine was recording. I would get that tape and see if the intruder said anything before he shot the person. Sometimes we get results. Then there are civil incidents, like someone calling to threaten you. If you don't pay this money, he's going to damage your car or kill your pet. You also have divorce proceedings where recordings get made, and you have people who keep calling to say something and then hanging up. We can analyze those calls." 
In fact, in one case, a murderer himself called the police to offer the location of the body. He said that he was an acquaintance and used another man's name. That man was eliminated and the murderer identified through voiceprint analysis. Owen uses the full range of spectrograph analysis, but he admits that the technology could still be better. "You can't accurately print all the 256 shades on the gray scale," he says. "The printers have gotten better, but only the most expensive ones really get the full range of resolution, and it's often not worth the cost of such a machine." Recently he completed a study on twenty-five female voices of varying races and ages, doing a one-to-one analysis to determine the degree of error. The results were striking: "When you're comparing a known and an unknown voice using a verbatim exemplar [the samples contain the same verbal communication], there are no errors. That's ninety-nine percent of what we do today. We don't try to pick a voice out of a pack." Because voiceprints are generally used in cases where the accuracy rate is so high, Owen is confident that they make a real contribution to the legal process. However, the history of admissibility of voiceprints has mirrored what has happened in court with other technologies in their early stages. Courts are conservative and the sound spectrograph has had to prove itself.
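The five conclusions described earlier in this chapter can be pictured as a simple decision rule. The sketch below is illustrative only. The article gives just two fixed points: twenty similar speech sounds for a positive identification and twenty or more differences for a positive elimination. The cutoff of ten for the "probable" categories, the handling of conflicting counts, and the function name are assumptions added here, not part of any published examiner standard.

```python
def voiceprint_conclusion(similar, different, positive=20, probable=10):
    """Map counts of similar and differing speech sounds to one of five conclusions.

    `positive` (20) comes from the article; `probable` (10) is a hypothetical
    cutoff chosen only to make the in-between categories concrete.
    """
    if similar >= positive and different >= positive:
        return "no decision"  # assumption: strongly conflicting evidence is left undecided
    if similar >= positive:
        return "positive identification"
    if different >= positive:
        return "positive elimination"
    if similar >= probable and similar > different:
        return "probable identification"
    if different >= probable and different > similar:
        return "probable elimination"
    return "no decision"
```

For example, `voiceprint_conclusion(20, 0)` gives a positive identification, while a comparison with only a handful of usable sounds on either side falls through to no decision. In practice an examiner weighs the quality of each match as well as the count, which a tally like this cannot capture.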
|
|
|
Post by The Tracker on Jan 13, 2006 19:36:20 GMT -5
Chapter 5 Standards of Courtroom Admissibility
Some background is important for understanding the admissibility of expert testimony on new technology, because several court cases have challenged and refined admissibility procedures. Throughout legal history, problems have arisen in the courtroom when novel information was presented that seemed difficult to evaluate for its accuracy, so rules were devised for deciding whether the evidence presented ought to have legal weight. Before examining specific spectrograph-relevant cases, a look at the evolution of judicial procedure regarding scientific evidence helps clarify what the courts have said about voiceprints. In 1923, the District of Columbia Court of Appeals issued an opinion that became the guideline for the admissibility of scientific evidence. In a case known as Frye, the defense counsel tried to enter evidence about a device that measured blood pressure during deception (a forerunner to the polygraph). The court decided that the thing from which the testimony is deduced must be "sufficiently established to have gained general acceptance in the particular field in which it belongs." It also had to be determined that the information was beyond the general knowledge of the jury. This Frye standard became general practice in most courts and continued to influence decisions for many years. It prevented courts from running mini-trials about the scientific evidence itself. However, critics claimed that the Frye standard excluded theories that were novel but nevertheless supported by evidence. In 1975 the Federal Rules of Evidence became effective and Rule 702 states: "If scientific, technical, or other specialized knowledge will assist the trier of fact to understand the evidence or to determine a fact in issue, a witness qualified as an expert by knowledge, skill, experience, training or education, may testify thereto in the form of an opinion or otherwise."
However, the courts varied greatly in the way they followed this standard. The Frye test was more effectively replaced in many states by a standard cited in 1993 in the Supreme Court's decision in Daubert v. Merrell Dow Pharmaceuticals, Inc., which emphasizes the trial judge's responsibility as a gatekeeper. The court decided that "scientific" means having a grounding in the methods and procedures of science, and "knowledge" must be stronger than subjective belief or unsupported speculation. The judge has only to focus on the methodology, not on the conclusion, and also must decide whether the scientific evidence applies to the facts of the case. In other words, judges have to determine: whether the theory can be tested; whether the potential error rate is known; whether it was reviewed by peers and has attracted widespread acceptance within a relevant scientific community; and whether the opinion is relevant to the issue in dispute. With voice spectrography, the National Academy of Sciences concluded in 1979 that, "The degree of accuracy and the corresponding error rates of aural-visual voice identification vary widely from case to case." In other words, the graph interpretation was viewed as lacking in standards, the equipment quality could be suspect, and the conditions under which a voice sample is taken vary so much as to have a significant impact on the identification procedure. The same goes for the knowledge and skill of the interpreter. That meant that there appeared to be no adequate scientific basis for bringing the results into court as sufficiently reliable for judges or juries to make just decisions based upon it. Nevertheless, this technology has been introduced in court in every state and only Maryland does not allow spectrograph evidence.
Despite rulings in states like New Jersey and California, which used Frye as a guideline, courts in Florida admitted spectrograms in 1972 as corroborative evidence: As long as other methods of identifying a suspect were used, voiceprints could be brought in. Eventually courts across the country began to yield as more and more experts came in to verify the reliability of the machine. Courts were especially impressed with the testimony of Dr. Oscar Tosi, who initially had testified against the use of the spectrograph. After extensive research of his own, he had switched his position and declared it to be highly reliable. However, in a California case, People v. Law, the appellate judge reversed a conviction, which was gained with the help of a spectrogram, based on the fact that no study had been carried out to test the effects on voiceprints of disguised or mimicked voices. Nevertheless, courts eventually gave greater weight to the testimony of those experts who had direct empirical experience in the field than to academic critics. Voiceprint technology did not have to prove to be infallible to be accepted as evidence. In the case of Reed v. State, the definition of "scientific community" was broadened to include people who had sufficient scientific background to understand the process in question and form a judgment about it. That decision assisted admissibility. As voiceprint technology and interpretations become more widely used, increasingly more courts are responding favorably to it. Nevertheless, it remains an unsettled question, in particular when courts review older decisions based on the technology in its early stages. "People are generally misinformed about voice identification," Tom Owen says. "I've never failed to qualify as an expert witness, nor failed to be heard in a case. Those who say that this technology is not accepted really have no clue. The only courts that have rejected it, did so in the seventies on the basis of the Frye hearings. 
It's decided on a case-by-case basis, but the only scientific community that counts for consensus is the community that is engaged in doing the work. In every case where someone has offered counter-testimony, that person was a teacher from a local college who did not actually participate actively in spectrographic analysis. They should not be considered part of the scientific community that is doing the work." Owen also points out that the trial decisions in the seventies were based on testimony from technicians who only used the voiceprint. "Now law enforcement primarily uses the aural spectrographic method, which means we listen to the tape first and then do the spectrograph. The American College of Forensic Examiners, which now controls who gets certified, has taken the position that the only way to do this is the aural spectrographic method. You have to actually listen to the tape, not just look at the graph." Certain precautions are observed during trials that provide clear context for the evidence and that work to ensure that all such testimony is properly understood. Juries are allowed to see the voiceprint and hear the tape recordings. The other side scrutinizes the expert's qualifications and the machine's quality. In the end, the jury is generally instructed to assign whatever weight they want to the evidence. That means that a lot will depend on the experience and demeanor of the voiceprint expert. To be convincing he or she needs proper training. Let's take a look at the specific skills involved.
|
|
|
Post by The Tracker on Jan 13, 2006 19:38:45 GMT -5
Chapter 6 Voiceprint Analysis Expertise
To be qualified as experts in voiceprint analysis, technicians must: complete a course of study on spectrographic analysis that generally runs from two to four weeks; complete one hundred voice comparison cases under intense personal supervision by a known expert; and be examined by a board of experts in the field. Since courts generally contest the methods of interpretation, not the actual accuracy or reliability of the spectrographic instrument, it is important that any spectrograph technician who testifies in court be highly qualified. The less training and experience the technician has, the more such testimony becomes vulnerable to serious questions by the judge and jury. All of the studies that have been done on spectrographic accuracy, including a 1986 FBI survey, show that those people who have been properly trained and who use standard aural and visual procedures get highly accurate results. The opposite is true where training and/or analysis methods are limited. Bringing such studies to the attention of the courts could help determine who is indeed an expert and could minimize some of the controversy and confusion that comes from misperception. Those who do the recordings for analysis must also be competent to operate the recording device, because the quality of the tape has great bearing on the interpreter's results. The skills involved in aural and visual voice interpretation include: critical listening, with an ear for anomalies and the ability to audit foreground information as distinguishable from background; the ability to check for tape tampering; experience reading magnetic tapes; the ability to operate the spectrograph equipment, both for general results and for zooming in on specific patterns; and the ability to work with an investigative team. In all likelihood, voiceprints will continue to play a key role in any investigation that involves voice evidence. As such, they will become part of the evidence brought into court.
Like other technologies that once were resisted but are now fully admissible, voiceprints may soon have their day.
|
|